We investigate the possibility of deriving metric trace semantics in a coalgebraic framework.
First, we generalize a technique for systematically lifting functors from the category Set of
sets to the category PMet of pseudometric spaces, by identifying conditions under which
also natural transformations, monads and distributive laws can be lifted. By exploiting some
recent work on an abstract determinization, these results enable the derivation of trace metrics
starting from coalgebras in Set. More precisely, for a coalgebra on Set we determinize it,
thus obtaining a coalgebra in the Eilenberg-Moore category of a monad. When the monad
can be lifted to PMet, we can equip the final coalgebra with a behavioral distance. The trace
distance between two states of the original coalgebra is the distance between their images in the
determinized coalgebra through the unit of the monad. We show how our framework applies
to nondeterministic automata and probabilistic automata.
1. Introduction

When considering the behavior of state-based system models embodying quantitative 
information, such as probabilities, time or cost, the interest normally shifts from 
behavioral equivalences to behavioral distances. In fact, in a quantitative setting, it 
is often quite unnatural to ask that two systems exhibit exactly the same behavior,
while it can be more reasonable to require that the distance between their behaviors 
is sufficiently small (see, e.g., [GJS90, DGJP04, VBW05, BBLM15, dAFS09, dAFS09,
FLT11]).


*This is an extended version of [BBKK15]. It consists of the material of the original paper and an 
appendix containing the proofs of the presented results. 


Coalgebras [Rut00] are a well-established abstract framework where a canonical
notion of behavioral equivalence can be uniformly derived. The behavior of a system is
represented as a coalgebra, namely a map of the form $X \to HX$, where $X$ is a state space
and $H$ is a functor that describes the type of computation performed. For instance
nondeterministic automata can be seen as coalgebras $X \to 2 \times \mathcal{P}(X)^A$: for any state
we specify whether it is final or not, and the set of successors for any given input in
$A$. Under suitable conditions a final coalgebra exists which can be seen as a minimized
version of the system, so that two states are deemed equivalent when they correspond
to the same state in the final coalgebra.

In a recent paper [BBKK14] we faced the problem of devising a framework where,
given a coalgebra for an endofunctor $H$ on Set, one can systematically derive
pseudometrics which measure the behavioral distance of states. A first crucial step is the
lifting of $H$ to a functor $\overline{H}$ on PMet, the category of pseudometric spaces. In particular,
we presented two different approaches which can be viewed as generalizations of the
Kantorovich and Wasserstein pseudometrics for probability measures. One can prove
that the final coalgebra in Set can be endowed with a metric, arising as a solution
of a fixpoint equation, turning it into the final coalgebra for the lifting $\overline{H}$. Since any
coalgebra $X \to HX$ can be seen as a coalgebra in PMet by endowing $X$ with the discrete
metric, the unique mapping into the final coalgebra provides a behavioral distance on $X$.

The canonical notion of equivalence for coalgebras, in a sense, fully captures the 
behavior of the system as expressed by the functor H. As such, it naturally corresponds 
to bisimulation equivalences already defined for various concrete formalisms. Sometimes
one is interested in coarser equivalences, ignoring some aspects of a computation,
a notable example being trace equivalence where the computational effect which is 
ignored is branching. 

In this paper, relying on recent work on an abstract determinization construction
for coalgebras in [SBBR13, JSS12, JSS15], we extend the above framework in order to
systematically derive trace metrics. The mentioned work starts from the observation
that the distinction between the behavior to be observed and the computational effects
that are intended to be hidden from the observer is sometimes formally captured
by splitting the functor $H$ characterizing system computations in two components, a
functor $F$ for the observable behavior and a monad $T$ describing the computational
effects, e.g., lifting $1 + -$, the powerset functor $\mathcal{P}$ or the distribution functor $\mathcal{D}$ provide
partial, nondeterministic or probabilistic computations, respectively. For instance,
the functor for nondeterministic automata $2 \times \mathcal{P}(X)^A$ can be seen as the composition
of the functor $FX = 2 \times X^A$, describing the transitions, with the powerset monad
$T = \mathcal{P}$, capturing nondeterminism. Trace semantics can be derived by viewing a
coalgebra $X \to 2 \times \mathcal{P}(X)^A$ as a coalgebra $\mathcal{P}(X) \to 2 \times \mathcal{P}(X)^A$, via a determinization
construction. Similarly probabilistic automata can be seen as coalgebras of the form
$X \to [0,1] \times \mathcal{D}(X)^A$, yielding coalgebras $\mathcal{D}(X) \to [0,1] \times \mathcal{D}(X)^A$ via determinization.

On this basis, [JSS15] develops a framework for deriving behavioral equivalences
which only consider the visible behavior, ignoring the computational effects. The core
idea consists in "incorporating" the effect of the monad also in the set of states $X$, which
thus becomes $TX$, by means of a construction that can be seen as an abstract form of
determinization. For functors of the shape $FT$, this can be done by lifting $F$ to a functor
$\widehat{F}$ in $\mathcal{EM}(T)$, the Eilenberg-Moore category of $T$, using a distributive law between $F$ and
$T$. In fact, the final $F$-coalgebra lifts to the final $\widehat{F}$-coalgebra in $\mathcal{EM}(T)$. The technique
works, at the price of some complications, also for functors of the shape $TF$ [JSS15].

Here, we exploit the results in [JSS15] for systematically deriving metric trace
semantics for Set-based coalgebras. The situation is summarized in the diagram at the
end of Subsection 5.1. As a first step, building on our technique for lifting functors
from the category Set of sets to the category PMet of pseudometric spaces, we identify
conditions under which also natural transformations, monads and distributive laws can
be lifted. In this way we obtain an adjunction between PMet and $\mathcal{EM}(\overline{T})$, where $\overline{T}$ is the
lifted monad. Via the lifted distributive law we can transfer a functor $\overline{F}$: PMet $\to$ PMet
to an endofunctor $\widehat{F}$ on $\mathcal{EM}(\overline{T})$. By using the trivial discrete distance, coalgebras of the
form $TX \to FTX$ can now live in $\mathcal{EM}(\overline{T})$ and can be equipped with a trace distance via
a map into the final coalgebra. This final coalgebra is again obtained by lifting the final
$\overline{F}$-coalgebra, i.e. a coalgebra equipped with a behavioral distance, to $\mathcal{EM}(\overline{T})$.

The trace distance between two states of the original coalgebra can then be defined as 
the distance between their images in the determinized coalgebra through the unit of the 
monad. We illustrate our framework by thoroughly discussing two running examples, 
namely nondeterministic automata and probabilistic automata. We show that it allows 
us to recover known or meaningful trace distances such as the standard ultrametric 
on word languages for nondeterministic automata or the total variation distance on 
distributions for probabilistic automata. 

The paper is structured as follows. In Section 2 we will introduce our notation and 
quickly recall the basics of our lifting framework from [BBKK14]. Then, in Section 3, 
we tackle the question of compositionality, i.e. we investigate whether based on liftings 
of two functors we can obtain a lifting of the composed functor. The lifting of natural 
transformations and monads is treated in Section 4. Equipped with these tools, we show 
as main result in Section 5 how to obtain trace pseudometrics in the Eilenberg-Moore 
category of a lifted monad. We conclude our paper with a discussion on related and 
future work (Section 6). Proofs can be found in Appendix P. 


2. Preliminaries 

In this section we recap some basic notions and fix the corresponding notation. We also 
briefly recall the results in [BBKK14] which will be exploited in the paper. 

We assume that the reader is familiar with the basic notions of category theory, 
especially with the definitions of functor, product, coproduct and weak pullbacks. 

For a function $f: X \to Y$ and sets $A \subseteq X$, $B \subseteq Y$ we write $f[A] := \{f(a) \mid a \in A\}$ for the
image of $A$ and $f^{-1}[B] = \{x \in X \mid f(x) \in B\}$ for the preimage of $B$. Finally, if $Y \subseteq [0,\infty]$
and $f, g: X \to Y$ are functions we write $f \le g$ if $\forall x \in X: f(x) \le g(x)$.

A probability distribution on a given set $X$ is a function $P: X \to [0,1]$ satisfying
$\sum_{x \in X} P(x) = 1$. For any set $B \subseteq X$ we define $P(B) = \sum_{x \in B} P(x)$. The support of $P$ is the
set $\mathrm{supp}(P) := \{x \in X \mid P(x) > 0\}$.

Given a natural number $n \in \mathbb{N}$ and a family $(X_i)_{i=1}^n$ of sets $X_i$ we denote the
projections of the (cartesian) product of the $X_i$ by $\pi_i: \prod_{i=1}^n X_i \to X_i$. For a source
$(f_i: X \to X_i)_{i=1}^n$ we denote the unique mediating arrow to the product by
$\langle f_1,\dots,f_n\rangle: X \to \prod_{i=1}^n X_i$. Similarly, given a family of arrows $(f_i: X_i \to Y_i)_{i=1}^n$ we write
$f_1 \times \dots \times f_n = \langle f_1 \circ \pi_1,\dots,f_n \circ \pi_n\rangle: \prod_{i=1}^n X_i \to \prod_{i=1}^n Y_i$.

For $\top \in (0,\infty]$ and a set $X$ we call any function $d: X^2 \to [0,\top]$ a ($\top$-)distance on $X$
(for our examples we will use $\top = 1$ or $\top = \infty$). Whenever $d$ satisfies, for all $x,y,z \in X$,
$d(x,x) = 0$ (reflexivity), $d(x,y) = d(y,x)$ (symmetry) and $d(x,y) \le d(x,z) + d(z,y)$
(triangle inequality) we call it a pseudometric and if it additionally satisfies
$d(x,y) = 0 \implies x = y$ we call it a metric. Given such a function $d$ on a set $X$, we say that
$(X,d)$ is a pseudometric/metric space. By $d_e: [0,\top]^2 \to [0,\top]$ we denote the ordinary
Euclidean distance on $[0,\top]$, i.e., $d_e(x,y) = |x - y|$ for $x,y \in [0,\top] \setminus \{\infty\}$, and, where
appropriate, $d_e(x,\infty) = \infty$ if $x \ne \infty$ and $d_e(\infty,\infty) = 0$. Addition is defined in the
usual way, in particular $x + \infty = \infty$ for $x \in [0,\infty]$. We call a function $f: X \to Y$ between
pseudometric spaces $(X,d_X)$ and $(Y,d_Y)$ nonexpansive and write $f: (X,d_X) \xrightarrow{1} (Y,d_Y)$ if
$d_Y \circ (f \times f) \le d_X$. If equality holds we call $f$ an isometry.
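On a finite carrier the axioms of a pseudometric and the nonexpansiveness condition can be checked by brute force. The following Python sketch is only an illustration under that finiteness assumption; the helper names `is_pseudometric` and `is_nonexpansive` are hypothetical and do not come from the paper.

```python
from itertools import product

def is_pseudometric(points, d, top=float("inf")):
    """Check reflexivity, symmetry and the triangle inequality on a finite set."""
    pts = list(points)
    for x in pts:
        if d(x, x) != 0:                      # reflexivity
            return False
    for x, y in product(pts, repeat=2):
        # values must lie in [0, top] and d must be symmetric
        if not 0 <= d(x, y) <= top or d(x, y) != d(y, x):
            return False
    # triangle inequality for all triples
    return all(d(x, y) <= d(x, z) + d(z, y)
               for x, y, z in product(pts, repeat=3))

def is_nonexpansive(f, points, d_x, d_y):
    """Check d_y(f(x), f(y)) <= d_x(x, y) for all pairs of points."""
    pts = list(points)
    return all(d_y(f(x), f(y)) <= d_x(x, y)
               for x, y in product(pts, repeat=2))

# Example: the Euclidean distance on a few numbers, and halving as a map.
euclid = lambda x, y: abs(x - y)
print(is_pseudometric([0, 1, 2], euclid))
print(is_nonexpansive(lambda x: x / 2, [0, 1, 2], euclid, euclid))
```

The squared difference $(x-y)^2$ is a typical function that passes the first two checks but fails the triangle inequality.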

By choosing a fixed maximal element $\top$ in our definition of distances, we ensure
that the set of pseudometrics over a fixed set with pointwise order is a complete lattice
(since $[0,\top]$ is) and we obtain a complete and cocomplete category of pseudometric
spaces and nonexpansive functions, which we denote by PMet. Given a functor $F$ on
Set, we aim at constructing a functor $\overline{F}$ on PMet which is a lifting of $F$ in the following
sense.

Definition 2.1 (Lifting). Let $U$: PMet $\to$ Set be the forgetful functor which maps every
pseudometric space to its underlying set. A functor $\overline{F}$: PMet $\to$ PMet is called a lifting
of a functor $F$: Set $\to$ Set if it satisfies $U\overline{F} = FU$.

Similarly to predicate liftings in coalgebraic modal logic [Sch08], a lifting to PMet can be
conveniently defined once a suitable (evaluation) function from $F[0,\top]$ to $[0,\top]$ is fixed.

Definition 2.2 (Evaluation Function & Evaluation Functor). Let $F$ be an endofunctor
on Set. An evaluation function for $F$ is a function $ev_F: F[0,\top] \to [0,\top]$. Given such a
function, we define the evaluation functor to be the endofunctor $\widetilde{F}$ on Set/$[0,\top]$, the slice
category¹ over $[0,\top]$, via $\widetilde{F}(g) = ev_F \circ Fg$ for all $g \in$ Set/$[0,\top]$. On arrows $\widetilde{F}$ is defined
as $F$.


¹The slice category Set/$[0,\top]$ has as objects all functions $g: X \to [0,\top]$ where $X$ is an arbitrary set. Given
$g$ as before and $h: Y \to [0,\top]$, an arrow from $g$ to $h$ is a function $f: X \to Y$ satisfying $h \circ f = g$.
A first lifting technique leads to what we called the Kantorovich pseudometric, which is
the smallest possible pseudometric $d^{\uparrow F}$ on $FX$ such that, for all nonexpansive functions
$f: (X,d) \xrightarrow{1} ([0,\top],d_e)$, also $\widetilde{F}f: (FX,d^{\uparrow F}) \xrightarrow{1} ([0,\top],d_e)$ is again nonexpansive.

Definition 2.3 (Kantorovich Pseudometric & Kantorovich Lifting). Let $F$: Set $\to$ Set be
a functor with an evaluation function $ev_F$. For every pseudometric space $(X,d)$ the
Kantorovich pseudometric on $FX$ is the function $d^{\uparrow F}: FX \times FX \to [0,\top]$, where for all
$t_1,t_2 \in FX$:

$$d^{\uparrow F}(t_1,t_2) := \sup\big\{d_e(\widetilde{F}f(t_1),\widetilde{F}f(t_2)) \;\big|\; f: (X,d) \xrightarrow{1} ([0,\top],d_e)\big\}.$$

The Kantorovich lifting of the functor $F$ is the functor $\overline{F}$: PMet $\to$ PMet defined as
$\overline{F}(X,d) = (FX,d^{\uparrow F})$ and $\overline{F}f = Ff$.

This definition is sound, i.e. $d^{\uparrow F}$ is guaranteed to be a pseudometric so that we indeed
obtain a lifting of the functor. A dual way for obtaining a pseudometric on $FX$ relies on
ideas from probability and transportation theory. It is based on the notion of couplings, 
which can be understood as a generalization of joint probability measures. 

Definition 2.4 (Coupling). Let $F$: Set $\to$ Set be a functor and $n \in \mathbb{N}$. Given a set $X$ and
$t_i \in FX$ for $1 \le i \le n$ we call an element $t \in F(X^n)$ such that $F\pi_i(t) = t_i$ a coupling of
the $t_i$ (with respect to $F$). We write $\Gamma_F(t_1,t_2,\dots,t_n)$ for the set of all these couplings.

Based on these couplings we are now able to define an alternative distance on FX. 

Definition 2.5 (Wasserstein Distance & Wasserstein Lifting). Let $F$: Set $\to$ Set be a functor
with evaluation function $ev_F$. For every pseudometric space $(X,d)$ the Wasserstein
distance on $FX$ is the function $d^{\downarrow F}: FX \times FX \to [0,\top]$ given by, for all $t_1,t_2 \in FX$,

$$d^{\downarrow F}(t_1,t_2) := \inf\big\{\widetilde{F}d(t) \;\big|\; t \in \Gamma_F(t_1,t_2)\big\}.$$

If $d^{\downarrow F}$ is a pseudometric for all pseudometric spaces $(X,d)$, we define the Wasserstein
lifting of $F$ to be the functor $\overline{F}$: PMet $\to$ PMet, $\overline{F}(X,d) = (FX,d^{\downarrow F})$, $\overline{F}f = Ff$.

The names Kantorovich and Wasserstein used for the liftings derive from transportation
theory [Vil09]. Indeed we obtain a transport problem if we instantiate $F$ with the
distribution functor $\mathcal{D}$ (see also Example 2.9 below). In order to measure the distance
between two probability distributions $s,t: X \to [0,1]$ it is useful to think of the following
analogy: assume that $X$ is a collection of cities (with distance function $d$ between them)
and $s,t$ represent supply and demand (in units of mass). The distance between $s,t$ can
be measured in two ways: the first is to set up an optimal transportation plan with
minimal costs (also called coupling) to transport goods from cities with excess supply
to cities with excess demand. The cost of transport is determined by the product of
mass and distance. In this way we obtain the Wasserstein distance. A different view is
to imagine a logistics firm that is commissioned to handle the transport. It sets prices
for each city and buys and sells for this price at every location. However, it has to
ensure that the price function (here, $f$) is nonexpansive, i.e., the difference of prices
between two cities is at most the distance of the cities, otherwise it will not be
worthwhile to outsource this task. This firm will attempt to maximize its profit, which
can be considered as the Kantorovich distance of $s,t$. The Kantorovich-Rubinstein
duality informs us that these two views lead to exactly the same result.

In Definition 2.5 we are not guaranteed, in general, that $d^{\downarrow F}$ is a pseudometric.
This is the case if we require $F$ to preserve weak pullbacks and impose the following
restrictions on the evaluation function.

Definition 2.6 (Well-Behaved). Let $F$ be a functor with an evaluation function $ev_F$. We
call $ev_F$ well-behaved if it satisfies the following conditions:

W1. $\widetilde{F}$ is monotone, i.e., for $f,g: X \to [0,\top]$ with $f \le g$, we have $\widetilde{F}f \le \widetilde{F}g$.

W2. For each $t \in F([0,\top]^2)$ it holds that $d_e(ev_F(t_1),ev_F(t_2)) \le \widetilde{F}d_e(t)$ for $t_i := F\pi_i(t)$.

W3. $ev_F^{-1}[\{0\}] = Fi[F\{0\}]$ where $i: \{0\} \hookrightarrow [0,\top]$ is the inclusion map.

While condition W1 is quite natural, for W2 and W3 some explanations are in order.
Condition W2 ensures that $\widetilde{F}\mathrm{id}_{[0,\top]} = ev_F: F[0,\top] \to [0,\top]$ is nonexpansive once $d_e$ is
lifted to $F[0,\top]$ (recall that for the Kantorovich lifting we require $\widetilde{F}f$ to be nonexpansive
for any nonexpansive $f$). Condition W3 requires that exactly the elements of $F\{0\}$ are
mapped to 0 via $ev_F$. This is necessary for reflexivity of the Wasserstein pseudometric.
Indeed, with this definition at hand we were able to prove the desired result.

Proposition 2.7 ([BBKK14]). If $F$ preserves weak pullbacks and $ev_F$ is well-behaved, then
$d^{\downarrow F}$ is a pseudometric for any pseudometric space $(X,d)$.

From now on, whenever we use the Wasserstein lifting $d^{\downarrow F}$, we implicitly assume to
be in the hypotheses of Proposition 2.7. It can be shown that, in general, $d^{\uparrow F} \le d^{\downarrow F}$.
Whenever equality holds we say that the functor and the evaluation function satisfy
the Kantorovich-Rubinstein duality. This is helpful in many situations (e.g., in [vBW06]
it allowed the reuse of an efficient linear programming algorithm to compute behavioral
distances) but it is usually difficult to obtain.

We now recall two examples which will play an important role in this paper. First, 
we consider the following bounded variant of the powerset functor. 

Example 2.8 (Finite Powerset). The finite powerset functor $\mathcal{P}_{fin}$ assigns to each set $X$ the
set $\mathcal{P}_{fin}X = \{S \subseteq X \mid |S| < \infty\}$ and to each function $f: X \to Y$ the function
$\mathcal{P}_{fin}f: \mathcal{P}_{fin}X \to \mathcal{P}_{fin}Y$, $\mathcal{P}_{fin}f(S) := f[S]$. This functor preserves weak pullbacks and the evaluation
function $\max: \mathcal{P}_{fin}([0,\infty]) \to [0,\infty]$ with $\max \emptyset = 0$ is well-behaved. The Kantorovich-Rubinstein
duality holds and the resulting distance is the Hausdorff pseudometric
which, for any pseudometric space $(X,d)$ and any $X_1,X_2 \in \mathcal{P}_{fin}X$, is defined as

$$d_H(X_1,X_2) = \max\Big\{\max_{x_1 \in X_1} \min_{x_2 \in X_2} d(x_1,x_2),\; \max_{x_2 \in X_2} \min_{x_1 \in X_1} d(x_1,x_2)\Big\}.$$
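The Hausdorff formula can be evaluated directly on finite sets. The sketch below is illustrative only: the function name `hausdorff` is hypothetical, and it assumes the conventions $\max \emptyset = 0$ (as stated for the evaluation function) and $\min \emptyset = \top$ (the usual convention for an infimum over an empty set), so that the distance between the empty set and a non-empty set is $\top$.

```python
def hausdorff(xs, ys, d, top=float("inf")):
    """Hausdorff distance of two finite sets xs, ys w.r.t. the distance d."""
    xs, ys = set(xs), set(ys)

    def directed(us, vs):
        # For each u take the distance to its closest point in vs,
        # then take the worst case over all u.
        # Assumed conventions: max of nothing is 0, min of nothing is top.
        return max((min((d(u, v) for v in vs), default=top) for u in us),
                   default=0)

    return max(directed(xs, ys), directed(ys, xs))

# Example on numbers with the Euclidean distance.
euclid = lambda x, y: abs(x - y)
print(hausdorff({0, 1}, {0, 3}, euclid))   # the point 3 is 2 away from {0, 1}
```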

Our second example is the following finite variant of the distribution functor. 

Example 2.9 (Finitely Supported Distributions). The probability distribution functor
$\mathcal{D}$ assigns to each set $X$ the set $\mathcal{D}X = \{P: X \to [0,1] \mid |\mathrm{supp}(P)| < \infty, P(X) = 1\}$ and to
each function $f: X \to Y$ the function $\mathcal{D}f: \mathcal{D}X \to \mathcal{D}Y$,
$\mathcal{D}f(P)(y) = \sum_{x \in f^{-1}[\{y\}]} P(x) = P(f^{-1}[\{y\}])$. $\mathcal{D}$ preserves weak pullbacks and the evaluation
function $ev_{\mathcal{D}}: \mathcal{D}[0,1] \to [0,1]$, $ev_{\mathcal{D}}(P) = \sum_{r \in [0,1]} r \cdot P(r)$ is well-behaved. For any
pseudometric space $(X,d)$ we obtain the Wasserstein pseudometric which, for any $P_1,P_2 \in \mathcal{D}X$, is defined as

$$d^{\downarrow\mathcal{D}}(P_1,P_2) = \min\Big\{\sum_{x_1,x_2 \in X} d(x_1,x_2) \cdot P(x_1,x_2) \;\Big|\; P \in \Gamma_{\mathcal{D}}(P_1,P_2)\Big\}.$$

The Kantorovich-Rubinstein duality [Vil09] holds from classical results in transportation
theory.
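The duality can be used to certify a distance by hand: the cost of any coupling is an upper bound for the Wasserstein distance, and the gap in expected price under any nonexpansive price function is a lower bound for the Kantorovich distance. The following Python sketch illustrates this on a three-point space. All names (`is_coupling`, `coupling_cost`, `price_gap`) and the example data are assumptions made for illustration; the code checks given candidates and does not search for an optimal coupling.

```python
from fractions import Fraction as Fr

def is_coupling(t, p1, p2):
    """Check that the joint distribution t on pairs has marginals p1 and p2."""
    m1, m2 = {}, {}
    for (x, y), w in t.items():
        if w < 0:
            return False
        m1[x] = m1.get(x, 0) + w
        m2[y] = m2.get(y, 0) + w
    strip = lambda m: {k: v for k, v in m.items() if v != 0}
    return strip(m1) == strip(p1) and strip(m2) == strip(p2)

def coupling_cost(t, d):
    """Transport cost of a coupling: sum of mass times distance."""
    return sum(w * d(x, y) for (x, y), w in t.items())

def price_gap(f, p1, p2):
    """Absolute difference of the expected price under p1 and p2."""
    expect = lambda p: sum(w * f[x] for x, w in p.items())
    return abs(expect(p1) - expect(p2))

# Hypothetical example: three cities on a line inside [0, 1].
pos = {"a": Fr(0), "b": Fr(1, 2), "c": Fr(1)}
d = lambda x, y: abs(pos[x] - pos[y])
p1 = {"a": Fr(1, 2), "b": Fr(1, 2)}
p2 = {"b": Fr(1, 2), "c": Fr(1, 2)}
plan = {("a", "b"): Fr(1, 2), ("b", "c"): Fr(1, 2)}   # candidate coupling
price = pos                                            # candidate price function

assert is_coupling(plan, p1, p2)
assert all(abs(price[x] - price[y]) <= d(x, y) for x in pos for y in pos)
upper = coupling_cost(plan, d)      # upper bound on the Wasserstein distance
lower = price_gap(price, p1, p2)    # lower bound on the Kantorovich distance
print(lower, upper)
```

Since the two bounds coincide in this example, both candidates are optimal and the common value is the distance between the two distributions.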

While these two functors can be nicely lifted using the theory developed so far, there
are other functors that require a more general treatment. For instance, consider
the endofunctor $F = B \times \_$ (left product with $B$) for some fixed $B$. Notice that for
$t_1,t_2 \in FX = B \times X$ with $t_i = (b_i,x_i)$ a coupling exists iff $b_1 = b_2$. As a consequence,
when $b_1 \ne b_2$, irrespective of the evaluation function we choose and of the distance
between $x_1$ and $x_2$ in $(X,d)$, the lifted Wasserstein pseudometric will always result
in $d^{\downarrow F}(t_1,t_2) = \top$. This can be counterintuitive, e.g., taking $B = [0,1]$, $X \ne \emptyset$ and
$t_1 = (0,x)$ and $t_2 = (\varepsilon,x)$ for a small $\varepsilon > 0$ and an $x \in X$. The reason is that we think of
$B = [0,1]$ as endowed with a non-discrete pseudometric, like e.g. the Euclidean metric
$d_e$, plugged into the product after the lifting. This intuition can indeed be formalized
by considering the lifting of the product seen as a functor from Set $\times$ Set into Set.
More generally, it can be seen that the definitions and results introduced so far for
endofunctors in Set straightforwardly extend to multifunctors on Set, namely functors
$F$: Set$^n$ $\to$ Set on the product category Set$^n$ for a natural number $n \in \mathbb{N}$. For ease of
presentation we will not spell out the details here (they are spelled out in [BBKK14]),
but just provide an important example of a bifunctor (i.e. $n = 2$).

Example 2.10 (Product Bifunctor). The weak pullback preserving product bifunctor
$F$: Set$^2$ $\to$ Set maps two sets $X_1,X_2$ to $F(X_1,X_2) = X_1 \times X_2$ and two functions $f_i: X_i \to Y_i$
to the function $F(f_1,f_2) = f_1 \times f_2$. In this paper we will use the well-behaved
evaluation functions $ev_F: [0,1]^2 \to [0,1]$ presented in the table below. Therein we also
list the pseudometric $(d_1,d_2)^F: (X_1 \times X_2)^2 \to [0,\top]$ we obtain for pseudometric spaces
$(X_1,d_1)$, $(X_2,d_2)$.

Parameters | $ev_F(r_1,r_2)$ | $(d_1,d_2)^F((x_1,x_2),(y_1,y_2))$
$c_1,c_2 \in (0,1]$ | $\max\{c_1 r_1, c_2 r_2\}$ | $\max\{c_1 d_1(x_1,y_1), c_2 d_2(x_2,y_2)\}$
$c_1,c_2 \in (0,1]$, $c_1 + c_2 \le 1$ | $c_1 r_1 + c_2 r_2$ | $c_1 d_1(x_1,y_1) + c_2 d_2(x_2,y_2)$


For $c_1 = c_2 = 1$ the first evaluation map yields exactly the categorical product in PMet.
In both cases the Kantorovich-Rubinstein duality holds and the supremum [infimum] 
of the Kantorovich [Wasserstein] pseudometric is always a maximum [minimum]. 


3. Compositionality for the Wasserstein Lifting

Our first step is to study compositionality of functor liftings, i.e. we identify some
sufficient conditions ensuring $\overline{F}\,\overline{G} = \overline{FG}$. This technical result will often be very useful
since it allows us to reason modularly and, consequently, to simplify the proofs needed
in the treatment of our examples. We will explicitly only consider the Wasserstein
approach which is the one employed in all the examples of this paper.

Given evaluation functions $ev_F$ and $ev_G$, we can easily construct an evaluation
function for the composition $FG$ by defining $ev_{FG} := \widetilde{F}ev_G = ev_F \circ Fev_G$. Our first
observation is that, whenever $F$ and $G$ preserve weak pullbacks, well-behavedness is
inherited.

Proposition 3.1 (Well-Behavedness of Composed Evaluation Function). Let $F, G$ be endofunctors
on Set with evaluation functions $ev_F$, $ev_G$. If both functors preserve weak pullbacks and
both evaluation functions are well-behaved then also $ev_{FG} = ev_F \circ Fev_G$ is well-behaved.

In the light of this result and the fact that FG certainly preserves weak pullbacks if F 
and G do, we can safely use the Wasserstein lifting for FG. A sufficient criterion for 
compositionality is the existence of optimal couplings for G. 

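Proposition 3.2 (Compositionality). Let $F, G$ be weak pullback preserving endofunctors on
Set with well-behaved evaluation functions $ev_F$, $ev_G$ and let $(X,d)$ be a pseudometric space.
Then $d^{\downarrow FG} \ge (d^{\downarrow G})^{\downarrow F}$. Moreover, if for all $t_1,t_2 \in GX$ there is an optimal $G$-coupling, i.e.
$\gamma(t_1,t_2) \in \Gamma_G(t_1,t_2)$ such that $d^{\downarrow G}(t_1,t_2) = \widetilde{G}d(\gamma(t_1,t_2))$, then $d^{\downarrow FG} = (d^{\downarrow G})^{\downarrow F}$.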

This criterion will turn out to be very useful for our later results. Nevertheless it 
provides just a sufficient condition for compositionality as the next example shows. 

Example 3.3. We consider the finite powerset functor $\mathcal{P}_{fin}$ of Example 2.8 and the
distribution functor $\mathcal{D}$ of Example 2.9 with their evaluation functions. Let $(X,d)$ be a
pseudometric space.

1. We have $d^{\downarrow\mathcal{D}\mathcal{D}} = (d^{\downarrow\mathcal{D}})^{\downarrow\mathcal{D}}$, by Proposition 3.2, because optimal couplings always
exist.

2. We have $d^{\downarrow\mathcal{P}_{fin}\mathcal{P}_{fin}} = (d^{\downarrow\mathcal{P}_{fin}})^{\downarrow\mathcal{P}_{fin}}$ although $\mathcal{P}_{fin}$-couplings do not always exist.

Note that when we lift the functor $\mathcal{P}_{fin}$ we do not have couplings in the case when we
determine the distance between an empty set $\emptyset$ and a non-empty set $Y \subseteq X$, since there
exists no subset of $X \times X$ that projects to both.

Compositionality can be defined analogously for multifunctors. Again, we will not 
spell this out completely but we will use it to obtain the machine bifunctor. Before we 
can do that, we first need to define another endofunctor. 

Example 3.4 (Input Functor). Let $A$ be a fixed finite set of inputs. The input functor
$F = \_^A$: Set $\to$ Set maps a set $X$ to the exponential $X^A$ and a function $f: X \to Y$
to $f^A: X^A \to Y^A$, $f^A(g) = f \circ g$. This functor preserves weak pullbacks. The two
evaluation functions listed below are well-behaved and yield the given Wasserstein
pseudometric on $X^A$ for any pseudometric space $(X,d)$.

$ev_F(s)$ | $d^{\downarrow F}(s_1,s_2)$
$\max_{a \in A} s(a)$ | $\max_{a \in A} d(s_1(a),s_2(a))$
$\sum_{a \in A} s(a)$ | $\sum_{a \in A} d(s_1(a),s_2(a))$



By composing this functor with the product bifunctor we obtain the machine bifunctor 
which we will use to obtain trace semantics. 

Example 3.5 (Machine Bifunctor). Let $A$ be a finite set of inputs, $I = \_^A$ the input functor
of Example 3.4, Id the identity endofunctor on Set and $P$ be the product bifunctor
of Example 2.10. The machine bifunctor is the composition $M := P \circ (\mathrm{Id} \times I)$, i.e. the
bifunctor $M$: Set$^2$ $\to$ Set with $M(B,X) := B \times X^A$. Since for Id and $I$ there are unique
(thus optimal) couplings we have compositionality. Depending on the choices of
evaluation function for $P$ and $I$ (for Id we always take $\mathrm{id}_{[0,1]}$) we obtain the following
well-behaved evaluation functions $ev_M: [0,1] \times [0,1]^A \to [0,1]$.


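Parameters | $ev_P(r_1,r_2)$ | $ev_I(s)$ | $ev_M(o,s)$
$c_1,c_2 \in (0,1]$ | $\max\{c_1 r_1, c_2 r_2\}$ | $\max_{a \in A} s(a)$ | $\max\{c_1 o,\; c_2 \max_{a \in A} s(a)\}$
$c_1,c_2 \in (0,1]$, $c_1 + c_2 \le 1$ | $c_1 r_1 + c_2 r_2$ | $|A|^{-1} \sum_{a \in A} s(a)$ | $c_1 o + c_2 |A|^{-1} \sum_{a \in A} s(a)$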


Let $(B,d_B)$, $(X,d)$ be pseudometric spaces. For any $t_1,t_2 \in M(B,X)$ with $t_i =
(b_i,s_i) \in B \times X^A$ there is a unique and therefore necessarily optimal coupling $t :=
(b_1,b_2,\langle s_1,s_2\rangle)$. Depending on the evaluation function, we obtain for the first case

$$(d_B,d)^{\downarrow M}(t_1,t_2) = \max\Big\{c_1 d_B(b_1,b_2),\; c_2 \cdot \max_{a \in A} d(s_1(a),s_2(a))\Big\}$$

and for the second case

$$(d_B,d)^{\downarrow M}(t_1,t_2) = c_1 d_B(b_1,b_2) + c_2 |A|^{-1} \sum_{a \in A} d(s_1(a),s_2(a)).$$
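Both formulas are easy to evaluate for concrete data. The Python sketch below is an illustration only: the function names are hypothetical, an element of $M(B,X)$ is assumed to be represented as a pair of an output and a dictionary from inputs to successor states, and the input set $A$ is assumed to be finite and non-empty.

```python
def machine_distance_max(t1, t2, d_b, d, c1, c2):
    """First case: max{c1 * d_B(b1, b2), c2 * max_a d(s1(a), s2(a))}."""
    (b1, s1), (b2, s2) = t1, t2
    return max(c1 * d_b(b1, b2),
               c2 * max(d(s1[a], s2[a]) for a in s1))

def machine_distance_sum(t1, t2, d_b, d, c1, c2):
    """Second case: c1 * d_B(b1, b2) + c2 * |A|^-1 * sum_a d(s1(a), s2(a))."""
    (b1, s1), (b2, s2) = t1, t2
    return (c1 * d_b(b1, b2)
            + c2 * sum(d(s1[a], s2[a]) for a in s1) / len(s1))

# Hypothetical example: outputs in [0, 1] with the Euclidean distance,
# states compared by the discrete metric, inputs A = {"a", "b"}.
euclid = lambda x, y: abs(x - y)
discrete = lambda x, y: 0 if x == y else 1
t1 = (0.25, {"a": "x", "b": "y"})
t2 = (0.75, {"a": "x", "b": "z"})
print(machine_distance_max(t1, t2, euclid, discrete, 0.5, 0.5))
print(machine_distance_sum(t1, t2, euclid, discrete, 0.5, 0.5))
```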


Usually we will fix the first argument (the set of outputs) of the machine bifunctor and 
consider the obtained machine endofunctor $M_B := M(B,\_)$. However, for the same
reasons as explained above for the product bifunctor, we need to consider it as bifunctor. 
One notable exception is the case where B = 2 , endowed with the discrete metric. Then 
we have the following result. 

Example 3.6. Consider the machine endofunctor $M_2 := M(2,\_) = 2 \times \_^A$ with evaluation
function $ev_{M_2}: 2 \times [0,1]^A \to [0,1]$, $(o,s) \mapsto c \cdot ev_I(s)$ where $c \in (0,1]$ and $ev_I$ is one of the
evaluation functions for the input functor from Example 3.4. If $d_2$ is the discrete metric
on 2 and $c = c_2$ (where $c_2$ is the parameter for the evaluation function of the machine
bifunctor as in Example 3.5) then the pseudometric obtained via the bifunctor lifting
coincides with the one obtained by endofunctor lifting, i.e. for all pseudometric spaces
$(X,d)$ we have $(d_2,d)^{\downarrow M} = d^{\downarrow M_2}$. Moreover, although couplings for $M_2$ do not always
exist we have = (d'l'^^) 


4. Lifting of Natural Transformations and Monads

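Recall that a monad on an arbitrary category C is a triple $(T,\eta,\mu)$ where $T$: C $\to$ C is
an endofunctor and $\eta: \mathrm{Id} \Rightarrow T$, $\mu: T^2 \Rightarrow T$ are natural transformations called unit ($\eta$)
and multiplication ($\mu$) such that the unit laws $\mu \circ T\eta = \mathrm{id}_T = \mu \circ \eta T$ and the
associativity law $\mu \circ T\mu = \mu \circ \mu T$ hold.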



If we have a monad on Set, we can of course use our framework to lift the endofunctor
$T$ to a functor $\overline{T}$ on pseudometric spaces. A natural question that arises is whether we
also obtain a monad on pseudometric spaces, i.e. if the components of the unit and 
the multiplication are nonexpansive with respect to the lifted pseudometrics. In order 
to answer this question, we first take a closer look at sufficient conditions for lifting 
natural transformations. 

Proposition 4.1 (Lifting of a Natural Transformation). Let $F, G$ be endofunctors on Set with
evaluation functions $ev_F$, $ev_G$ and $\lambda: F \Rightarrow G$ be a natural transformation. Then the following
holds for all pseudometric spaces $(X,d)$. For the Kantorovich lifting:

1. If $ev_G \circ \lambda_{[0,\top]} \le ev_F$ then $d^{\uparrow G} \circ (\lambda_X \times \lambda_X) \le d^{\uparrow F}$, i.e. $\lambda_X$ is nonexpansive.

2. If $ev_G \circ \lambda_{[0,\top]} = ev_F$ then $d^{\uparrow G} \circ (\lambda_X \times \lambda_X) = d^{\uparrow F}$, i.e. $\lambda_X$ is an isometry,

while for the Wasserstein lifting:

3. If $ev_G \circ \lambda_{[0,\top]} \le ev_F$ then $d^{\downarrow G} \circ (\lambda_X \times \lambda_X) \le d^{\downarrow F}$, i.e. $\lambda_X$ is nonexpansive.

4. If $ev_G \circ \lambda_{[0,\top]} = ev_F$ and the Kantorovich-Rubinstein duality holds for $F$, i.e. $d^{\uparrow F} = d^{\downarrow F}$,
then $d^{\downarrow G} \circ (\lambda_X \times \lambda_X) = d^{\downarrow F}$, i.e. $\lambda_X$ is an isometry.

In the rest of the paper we will call a natural transformation $\lambda$ nonexpansive [an
isometry] if (and only if) each of its components is nonexpansive [an isometry] and
write $\overline{\lambda}$ for the resulting natural transformation from $\overline{F}$ to $\overline{G}$. Instead of checking
nonexpansiveness separately for each component of a natural transformation, we can 
just check the above (in-)equalities involving the two evaluation functions. 

By applying these conditions on the unit and multiplication of a given monad, we 
can now provide sufficient criteria for a monad lifting. 

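Corollary 4.2 (Lifting of a Monad). Let $(T,\eta,\mu)$ be a Set-monad and $ev_T$ an evaluation
function for $T$. Then the following holds.

1. If $ev_T \circ \eta_{[0,\top]} \le \mathrm{id}_{[0,\top]}$ then $\eta$ is nonexpansive for both liftings. Hence we obtain the unit
$\overline{\eta}: \overline{\mathrm{Id}} \Rightarrow \overline{T}$ in PMet.

2. If $ev_T \circ \eta_{[0,\top]} = \mathrm{id}_{[0,\top]}$ then $\eta$ is an isometry for both liftings.

3. Let $d^T \in \{d^{\uparrow T}, d^{\downarrow T}\}$. If $ev_T \circ \mu_{[0,\top]} \le ev_T \circ Tev_T$ and compositionality holds for $TT$,
i.e. $(d^T)^T = d^{TT}$, then $\mu$ is nonexpansive, i.e. $d^T \circ (\mu_X \times \mu_X) \le (d^T)^T$. This yields the
multiplication $\overline{\mu}: \overline{T}\,\overline{T} \Rightarrow \overline{T}$ in PMet.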

We conclude this section with two examples of liftable monads. 

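Example 4.3 (Finite Powerset Monad). The finite powerset functor $\mathcal{P}_{fin}$ of Example 2.8
can be seen as a monad, with unit $\eta$ consisting of the functions $\eta_X: X \to \mathcal{P}_{fin}X$,
$\eta_X(x) = \{x\}$ and multiplication given by $\mu_X: \mathcal{P}_{fin}\mathcal{P}_{fin}X \to \mathcal{P}_{fin}X$, $\mu_X(\mathcal{S}) = \bigcup\mathcal{S}$. We check
if our conditions for the Wasserstein lifting are satisfied. Given $r \in [0,\infty]$ we have
$ev_T \circ \eta_{[0,\infty]}(r) = \max\{r\} = r$ and for $\mathcal{S} \in \mathcal{P}_{fin}(\mathcal{P}_{fin}[0,\infty])$ we have $ev_T \circ \mu_{[0,\infty]}(\mathcal{S}) =
\max\bigcup\mathcal{S} = \max\bigcup_{S \in \mathcal{S}} S$ and $ev_T \circ Tev_T(\mathcal{S}) = \max(ev_T[\mathcal{S}]) = \max\{\max S \mid S \in \mathcal{S}\}$ and
thus both values coincide. Moreover, we recall from Example 3.3.2 that we have
compositionality for $\mathcal{P}_{fin}\mathcal{P}_{fin}$. Therefore, by Corollary 4.2, $\eta$ is an isometry and $\mu$ is
nonexpansive.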
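The two conditions of Corollary 4.2 checked in this example can also be tested on sample data. The sketch below is illustrative: the names `pfin_unit`, `pfin_mult`, `pfin_map` and `ev_max` are hypothetical, finite sets are modelled as Python `frozenset` values, and the convention $\max \emptyset = 0$ is taken from Example 2.8.

```python
def pfin_unit(x):
    """Unit of the finite powerset monad: x is sent to the singleton {x}."""
    return frozenset([x])

def pfin_mult(family):
    """Multiplication: a finite set of finite sets is sent to its union."""
    return frozenset().union(*family)

def pfin_map(f, s):
    """Action of the functor on a function: the image f[s]."""
    return frozenset(f(x) for x in s)

def ev_max(s):
    """Evaluation function max with the convention max of the empty set = 0."""
    return max(s, default=0)

# Sample element of P_fin(P_fin[0, oo]); it includes the empty set on purpose.
family = frozenset({frozenset({0.2, 0.7}), frozenset(), frozenset({0.5})})

# Unit condition: ev(unit(r)) = r.
print(ev_max(pfin_unit(0.3)))
# Multiplication condition: ev(union) equals ev applied to the set of maxima.
print(ev_max(pfin_mult(family)), ev_max(pfin_map(ev_max, family)))
```

A finite sample can only illustrate the equalities; the general argument is the one given in the example.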

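Example 4.4 (Distribution Monad). The probability distribution functor $\mathcal{D}$ of Example 2.9
can be seen as a monad: the unit $\eta$ consists of the functions $\eta_X: X \to \mathcal{D}X$, $\eta_X(x) = \delta_x$
where $\delta_x$ is the Dirac distribution and the multiplication is given by $\mu_X: \mathcal{D}\mathcal{D}X \to \mathcal{D}X$,
$\mu_X(P) = \lambda x. \sum_{q \in \mathcal{D}X} P(q) \cdot q(x)$. We consider its Wasserstein lifting. Since $[0,1] \cong \mathcal{D}2$
we can see that $ev_{\mathcal{D}} = \mu_2$. Using this and the monad laws we have $ev_{\mathcal{D}} \circ \eta_{[0,1]} =
\mu_2 \circ \eta_{\mathcal{D}2} = \mathrm{id}_{\mathcal{D}2} = \mathrm{id}_{[0,1]}$ and also $ev_{\mathcal{D}} \circ \mu_{[0,1]} = \mu_2 \circ \mu_{\mathcal{D}2} = \mu_2 \circ \mathcal{D}\mu_2 = ev_{\mathcal{D}} \circ \mathcal{D}ev_{\mathcal{D}}$.
Moreover, since we always have optimal couplings, we have compositionality for $\mathcal{D}\mathcal{D}$
by Proposition 3.2. Thus by Corollary 4.2 $\eta$ is an isometry and $\mu$ is nonexpansive.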
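The same two conditions can be tested numerically for the distribution monad, where the evaluation function is the expected value. The sketch below is an illustration under stated assumptions: finitely supported distributions are modelled as dictionaries with exact `Fraction` weights, an element of $\mathcal{D}\mathcal{D}[0,1]$ is modelled as a list of (weight, inner distribution) pairs, and the names `dirac`, `dist_mult`, `push_ev` and `ev_expect` are hypothetical.

```python
from fractions import Fraction as Fr

def dirac(x):
    """Unit of the distribution monad: the Dirac distribution at x."""
    return {x: Fr(1)}

def dist_mult(pp):
    """Multiplication: flatten a distribution over distributions.

    pp is a list of (weight, inner) pairs; the result assigns to x the sum
    of weight * inner(x) over all pairs.
    """
    out = {}
    for w, q in pp:
        for x, p in q.items():
            out[x] = out.get(x, 0) + w * p
    return out

def ev_expect(p):
    """Evaluation function: expected value of a distribution on [0, 1]."""
    return sum(r * w for r, w in p.items())

def push_ev(pp):
    """Image of pp under D(ev): each inner distribution is replaced by its mean."""
    out = {}
    for w, q in pp:
        m = ev_expect(q)
        out[m] = out.get(m, 0) + w
    return out

q1 = {Fr(0): Fr(1, 2), Fr(1): Fr(1, 2)}     # mean 1/2
q2 = {Fr(1, 4): Fr(1)}                      # mean 1/4
pp = [(Fr(1, 3), q1), (Fr(2, 3), q2)]

print(ev_expect(dirac(Fr(3, 10))))                     # unit condition
print(ev_expect(dist_mult(pp)), ev_expect(push_ev(pp)))  # multiplication condition
```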
5. Trace Metrics in Eilenberg-Moore

As mentioned in the introduction, trace semantics can be characterized by means of 
coalgebras either over Kleisli [PT99, HJS07] or over Eilenberg-Moore [SBBR13, JSS15] 
categories. We focus on the latter approach. We first recall the basic notions of
Eilenberg-Moore algebras and distributive laws, and discuss how the results in the paper can be
used to "lift" the associated determinization construction. This is then applied to derive 
trace metrics for nondeterministic automata and probabilistic automata, by relying on 
suitable liftings of the machine functor. 


5.1. Generalized Powerset Construction 

An Eilenberg-Moore algebra for a monad $(T,\eta,\mu)$ is a C-arrow $a: TA \to A$ satisfying
the first two equations below (unit and multiplication law). Given two such algebras
$a: TA \to A$ and $b: TB \to B$, a morphism from $a$ to $b$ is a C-arrow $f: A \to B$ satisfying
the third equation.

$$a \circ \eta_A = \mathrm{id}_A, \qquad a \circ Ta = a \circ \mu_A, \qquad f \circ a = b \circ Tf.$$



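Eilenberg-Moore algebras and their morphisms form a category denoted by $\mathcal{EM}(T)$.
A functor $\widehat{F}: \mathcal{EM}(T) \to \mathcal{EM}(T)$ is called a lifting of $F$: C $\to$ C to $\mathcal{EM}(T)$ if $U^T\widehat{F} = FU^T$,
with $U^T: \mathcal{EM}(T) \to$ C the forgetful functor. A natural transformation $\lambda: TF \Rightarrow FT$ is an
EM-law (also called distributive law) if it satisfies:

$$\lambda \circ \eta F = F\eta, \qquad \lambda \circ \mu F = F\mu \circ \lambda T \circ T\lambda.$$
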


Liftings and EM-laws are related by the following folklore result (see e.g. [JSS12]).

Proposition 5.1. There is a bijective correspondence between EM-laws and liftings to
EM-categories.


£M-laws and liftings are crucial to characterize trace semantics via coalgebras. Given 
a coalgebra c: X —)• FTX, for a functor F and a monad (T,ri, p) such that there is a 
distributive law A: TF FT, one can build an F-coalgebra as 


$c^\sharp := \big(TX \xrightarrow{\;Tc\;} TFTX \xrightarrow{\;\lambda_{TX}\;} FTTX \xrightarrow{\;F\mu_X\;} FTX\big)$
If there exists a final $F$-coalgebra $\omega\colon \Omega \to F\Omega$, one can define a semantic map for the
$FT$-coalgebra $c$ into $\Omega$. First let $[\![-]\!]\colon TX \to \Omega$ be the unique coalgebra morphism from
$c^\sharp$. Then take the map $[\![-]\!] \circ \eta_X\colon X \to \Omega$.



One can readily check that $c^\sharp$ is an algebra map from the $T$-algebra $\mu_X$ to $\widehat{F}\mu_X$, namely it
is an $\widehat{F}$-coalgebra or, equivalently, a $\lambda$-bialgebra [TP97, Kli11]. Similarly for $\omega$, the carrier $\Omega$
has a $T$-algebra structure obtained by finality and hence the final $F$-coalgebra $\omega$ can be
lifted in order to obtain the final $\widehat{F}$-coalgebra (see [JSS12, Prop. 4]).

This result holds for arbitrary categories and, in particular, we can reuse it for our
setting: we only need an $\mathcal{EM}$-law on PMet. Note that Proposition 4.1 not only provides
sufficient conditions for monad liftings but also can be exploited to lift $\mathcal{EM}$-laws.
Indeed the additional commutativity requirements for $\mathcal{EM}$-laws trivially hold when all
components are nonexpansive.

Corollary 5.2 (Lifting of an $\mathcal{EM}$-law). Let $F$, $G$ be weak pullback preserving endofunctors on
Set with well-behaved evaluation functions $\mathrm{ev}_F$, $\mathrm{ev}_G$ and $\lambda\colon FG \Rightarrow GF$ be an $\mathcal{EM}$-law. If the
evaluation functions satisfy $\mathrm{ev}_G \circ G\mathrm{ev}_F \circ \lambda_{[0,\top]} \le \mathrm{ev}_F \circ F\mathrm{ev}_G$ and compositionality holds for
$FG$, then $\lambda$ is nonexpansive and hence $\lambda\colon \overline{F}\,\overline{G} \Rightarrow \overline{G}\,\overline{F}$ is also an $\mathcal{EM}$-law.

We will now consider $\mathcal{EM}$-laws for nondeterministic and probabilistic automata. In
the first case, $T$ is the powerset monad $\mathcal{P}_{fin}$ and $F$ is the machine functor $M_2 = 2 \times \_^A$
while in the second case $T$ is the distribution monad $\mathcal{D}$ and $F$ is the machine functor
$M_{[0,1]} = [0,1] \times \_^A$. Note however that while in the first case Corollary 5.2 is directly
applicable, this is not true in the second case, since we need to deal with multifunctors.

Example 5.3 ($\mathcal{EM}$-law for Nondeterministic Automata). Let $(\mathcal{P}_{fin}, \eta, \mu)$ be the finite
powerset monad from Example 4.3. The $\mathcal{EM}$-law $\lambda\colon \mathcal{P}_{fin}(2 \times \_^A) \Rightarrow 2 \times \mathcal{P}_{fin}(\_)^A$ is defined,
for any set $X$, as

$\lambda_X(S) = \big(o,\ \lambda a \in A.\,\{s'(a) \mid (o', s') \in S\}\big)$,
where $o = 1$ if $\exists s' \in X^A.\,(1, s') \in S$ and $o = 0$ otherwise.

This is exactly the one exploited for the standard powerset construction from automata
theory [SBBR13]. Indeed, for a nondeterministic automaton $c\colon X \to 2 \times \mathcal{P}_{fin}(X)^A$ the
map $[\![-]\!] \circ \eta_X$ assigns to each state its accepted language. Corollary 5.2 ensures that it
is nonexpansive (see Appendix P for a detailed proof).
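
As a minimal sketch of how this law drives the powerset construction, the Python fragment below implements $\lambda_X$ and one step of $c^\sharp = F\mu_X \circ \lambda_{TX} \circ Tc$ for a finite automaton. The dictionary encoding, the helper names `lam`, `determinize_step`, `accepts` and the two-state automaton are illustrative assumptions, not part of the original development.

```python
def lam(S, alphabet):
    """Component lambda_X of the law: S is a finite collection of pairs (o, s),
    with o in {0, 1} and s a dict mapping each letter to a successor."""
    out = 1 if any(o == 1 for o, _ in S) else 0
    succ = {a: frozenset(s[a] for _, s in S) for a in alphabet}
    return out, succ


def determinize_step(c, U, alphabet):
    """One step of c^sharp = F(mu) . lambda . T(c) on a set of states U."""
    S = [c[x] for x in U]                      # T(c): apply c to every state in U
    out, succ = lam(S, alphabet)               # lambda: succ[a] is a set of sets
    flat = {a: frozenset().union(*succ[a]) for a in alphabet}  # F(mu): big union
    return out, flat


def accepts(c, x, word, alphabet):
    """Value of the language of state x on `word`, via the determinized automaton."""
    U = frozenset([x])                         # unit: eta_X(x) = {x}
    for a in word:
        U = determinize_step(c, U, alphabet)[1][a]
    return determinize_step(c, U, alphabet)[0]


# Hypothetical automaton: state 0 accepts exactly the words ending in 'a'.
ALPHABET = "ab"
nfa = {
    0: (0, {"a": frozenset({0, 1}), "b": frozenset({0})}),
    1: (1, {"a": frozenset(), "b": frozenset()}),
}
```

Calling `accepts(nfa, 0, "ba", ALPHABET)` follows the subsets {0}, {0}, {0, 1} and reads off the output of the last one.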
Example 5.4 ($\mathcal{EM}$-law for Probabilistic Automata). Let $(\mathcal{D}, \eta, \mu)$ be the distribution monad
from Example 4.4 and $M$ be the machine bifunctor from Example 3.5. There is a
known [SBBR13] $\mathcal{EM}$-law $\lambda\colon \mathcal{D}([0,1] \times \_^A) \Rightarrow [0,1] \times \mathcal{D}(\_)^A$ given by the assignment



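$\lambda_X(P) = \Big(\sum_{r \in [0,1]} r \cdot P(r, X^A),\ \lambda a.\lambda x.\sum_{s \in X^A,\, s(a) = x} P([0,1], s)\Big)$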


Also this $\mathcal{EM}$-law is nonexpansive, as shown in Appendix P.
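
The following Python sketch illustrates one plausible reading of this law on finitely supported distributions: the first component is the expected output, the second pushes the distribution forward along evaluation at each letter. The list-of-triples encoding and the name `lam_prob` are assumptions made for illustration only.

```python
from collections import defaultdict


def lam_prob(P, alphabet):
    """P is a list of triples (weight, r, s): weight is the probability of the
    pair (r, s), r is an output in [0, 1], s is a dict letter -> successor.
    Returns (expected output, {letter: distribution over successors})."""
    expected = sum(w * r for w, r, _ in P)
    succ = {}
    for a in alphabet:
        dist = defaultdict(float)
        for w, _, s in P:
            dist[s[a]] += w        # mass of all pairs whose s sends a to this state
        succ[a] = dict(dist)
    return expected, succ
```

The weights are assumed to sum to 1; the function does not check this.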

Any $FT$-coalgebra $c\colon X \to FTX$ can always be regarded as an $\overline{F}\,\overline{T}$-coalgebra by equipping
$X$ with the discrete metric assigning $\top$ to non-equal states (in this way, $c$ is trivially
nonexpansive). The consequence of the nonexpansiveness of the $\mathcal{EM}$-laws $\lambda$ is the
following: the "generalized determinization" procedure for nondeterministic and
probabilistic automata can now be lifted to pass from $\overline{F}\,\overline{T}$-coalgebras to $\widehat{\overline{F}}$-coalgebras
in $\mathcal{EM}(\overline{T})$ by using the upper adjunction in the diagram below (analogously to [JSS12,
JSS15]).

Since we can also lift the final $F$-coalgebra to $\mathcal{EM}(\overline{T})$, we can use it to define trace
distance. This procedure is detailed in the next section.

5.2. Final Coalgebra for the Lifted Machine Functor 

If we fix the first component of the machine bifunctor $M$ on Set we obtain an
endofunctor $M_B\colon \mathrm{Set} \to \mathrm{Set}$, $M_B(X) = B \times X^A$. It is known [MA86] that the final coalgebra
for this functor is $\kappa\colon B^{A^*} \to B \times (B^{A^*})^A$ with $\kappa(t) = (t(\varepsilon), \lambda a \in A.\lambda w \in A^*.t(aw))$. We
employ an analogous construction with our lifted machine bifunctor $\overline{M}$ on PMet, i.e.
we fix a pseudometric space $(B, d_B)$ of outputs and consider coalgebras of the functor
$\overline{M}_{(B,d_B)} := \overline{M}((B, d_B), \_)$. To obtain the final coalgebra for this functor in PMet, we use
the following result from [BBKK14].

Proposition 5.5 ([BBKK14, Thm. 6.1]). Let $\overline{F}\colon \mathrm{PMet} \to \mathrm{PMet}$ be a lifting of a functor
$F\colon \mathrm{Set} \to \mathrm{Set}$ which has a final coalgebra $\kappa\colon \Omega \to F\Omega$. For every ordinal $i$ we construct
a pseudometric $d_i\colon \Omega \times \Omega \to [0,\top]$ as follows: $d_0 := 0$ is the zero pseudometric,
$d_{i+1} := d_i^F \circ (\kappa \times \kappa)$ for all ordinals $i$ and $d_j := \sup_{i<j} d_i$ for all limit ordinals $j$. This sequence
converges for some ordinal $\theta$, i.e. $d_\theta = d_\theta^F \circ (\kappa \times \kappa)$. Moreover $\kappa\colon (\Omega, d_\theta) \to (F\Omega, d_\theta^F)$ is the
final $\overline{F}$-coalgebra.

It is hence enough to do fixed-point iteration for the functor F on the determinized 
state set TX in order to obtain trace distance. The lifted monad is ignored at this stage, 
but its lifting is of course necessary to establish the Eilenberg-Moore category and its 
adjunction. 
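
The iteration can be sketched on a finite deterministic automaton (for instance a determinized one). The Python fragment below assumes outputs in {0, 1} with the discrete metric and the discounted-maximum lifting that Example 5.6 uses for nondeterministic automata; the function name, the data layout and the three-state automaton are hypothetical.

```python
def trace_pseudometric(states, out, delta, alphabet, c, tol=1e-12):
    """Iterate d_{i+1}(x, y) = max(d_2(out x, out y), c * max_a d_i(x.a, y.a))
    from d_0 = 0 until nothing changes. Assumes a non-empty alphabet and
    0 < c < 1; on a finite automaton the values stabilise after finitely
    many rounds."""
    d = {(x, y): 0.0 for x in states for y in states}
    while True:
        new = {
            (x, y): max(
                float(out[x] != out[y]),
                c * max(d[delta[x][a], delta[y][a]] for a in alphabet),
            )
            for x in states
            for y in states
        }
        if all(abs(new[k] - d[k]) <= tol for k in d):
            return new
        d = new


# Hypothetical one-letter automaton: 0 -a-> 1 -a-> 2 -a-> 2, only state 2 accepts.
STATES = [0, 1, 2]
OUT = {0: 0, 1: 0, 2: 1}
DELTA = {0: {"a": 1}, 1: {"a": 2}, 2: {"a": 2}}
dist = trace_pseudometric(STATES, OUT, DELTA, "a", c=0.5)
```

States 0 and 1 are first separated by the word "a", so their distance is c; states 0 and 2 already differ on the empty word.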

We now consider our two example cases, where in both cases $F$ is the machine functor
$M_B$ (for two different choices of $B$):

Example 5.6 (Final Coalgebra Pseudometric). Let M be the machine bifunctor. 

1. We start with nondeterministic automata where the output set is $B = 2$ and we
use the discrete metric $d_2$ as distance on 2 as in Example 3.6. As maximal distance
we take $\top = 1$ and as evaluation function we use $\mathrm{ev}_M(o, s) = \max\{o,\ c \cdot \max_{a \in A} s(a)\}$ for
$0 < c < 1$.

For any pseudometric $d$ on $2^{A^*}$ (the carrier of the final $M_2$-coalgebra) we know
that for elements $(o_1, s_1), (o_2, s_2) \in 2 \times (2^{A^*})^A$ we have the Wasserstein pseudometric
$d^{\downarrow M}((o_1,s_1),(o_2,s_2)) = \max\{d_2(o_1,o_2),\ c \cdot \max_{a \in A} d(s_1(a), s_2(a))\}$. Thus the
fixed-point equation from Proposition 5.5 is, for $L_1, L_2 \in 2^{A^*}$,

$d(L_1, L_2) = \max\{d_2(L_1(\varepsilon), L_2(\varepsilon)),\ c \cdot \max_{a \in A} d(\lambda w.L_1(aw), \lambda w.L_2(aw))\}$

Now because $d_2$ is the discrete metric with $d_2(0,1) = 1$ we see that $d_{2^{A^*}}$ as defined
below is indeed the least fixed-point of this equation and thus $(2^{A^*}, d_{2^{A^*}})$ is the
carrier of the final $\overline{M}_2$-coalgebra.

$d_{2^{A^*}}\colon 2^{A^*} \times 2^{A^*} \to [0,1]$, $\quad d_{2^{A^*}}(L_1, L_2) = c^{\inf\{n \in \mathbb{N} \mid \exists w \in A^n.\, L_1(w) \neq L_2(w)\}}$

(with the conventions $\inf \emptyset = \infty$ and $c^\infty = 0$).

A determinized coalgebra has as carrier the sets of states $\mathcal{P}_{fin}(X)$. Each of these sets is
mapped to the language that it accepts and the distance between two languages
$L_1, L_2\colon A^* \to 2$ can be determined by looking for a word $w$ of minimal length which
is contained in one and not in the other. Then, the distance is computed as $c^{|w|}$.
This corresponds to the standard ultrametric on words.

2. Next we consider probabilistic automata where $B = [0,1]$ equipped with the
standard Euclidean metric $d_e$.

Furthermore the remaining parameters are set as follows: let $\top = 1$ and the
evaluation function is $\mathrm{ev}_M(o, s) = c_1 o + \frac{c_2}{|A|} \sum_{a \in A} s(a)$ for $c_1, c_2 \in (0,1)$ such
that $c_1 + c_2 \le 1$ as in Example 3.5. This time, the machine functor must be lifted
as a bifunctor in order to obtain the appropriate distance (cf. the discussion before
Example 2.10).

For any pseudometric $d$ on $[0,1]^{A^*}$ we know that for $(r_1, s_1), (r_2, s_2) \in [0,1] \times ([0,1]^{A^*})^A$
we have $d^{\downarrow M}((r_1,s_1),(r_2,s_2)) = c_1 |r_1 - r_2| + \frac{c_2}{|A|} \sum_{a \in A} d(s_1(a), s_2(a))$.
Thus the fixed-point equation from Proposition 5.5 is, for $p_1, p_2 \in [0,1]^{A^*}$:

$d(p_1, p_2) = c_1 |p_1(\varepsilon) - p_2(\varepsilon)| + \frac{c_2}{|A|} \sum_{a \in A} d(\lambda w.p_1(aw), \lambda w.p_2(aw))$

It is again easy to see that $d_{[0,1]^{A^*}}\colon [0,1]^{A^*} \times [0,1]^{A^*} \to [0,1]$ as presented below is
the least fixed-point of this equation and therefore $([0,1]^{A^*}, d_{[0,1]^{A^*}})$ the carrier of
the final $\overline{M}_{([0,1],d_e)}$-coalgebra.

$d_{[0,1]^{A^*}}(p_1, p_2) = c_1 \cdot \sum_{w \in A^*} \left(\frac{c_2}{|A|}\right)^{|w|} |p_1(w) - p_2(w)|$.

Here, a determinized coalgebra has as carrier distributions on states $\mathcal{D}(X)$. Each
such distribution is mapped to a function $p\colon A^* \to [0,1]$ assigning numerical values
to words. Then the distance, which can be thought of as a form of total variation
distance with discount, is computed by the above formula.

If instead of working in the interval $[0,1]$ we use $[0,\infty]$ with $\top = \infty$, we can drop
the conditions $c_1, c_2 < 1$ and $c_1 + c_2 \le 1$. In this case we may set $c_2 := |A|$ and
$c_1 := 1/2$ and then the above distance is equal to the total variation distance, i.e.,

$d_{[0,\infty]^{A^*}}(p_1, p_2) = \frac{1}{2} \cdot \sum_{w \in A^*} |p_1(w) - p_2(w)|$.
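
The discounted distance for probabilistic automata can be approximated by unfolding the fixed-point equation a finite number of times on the determinized system. The Python sketch below does this for a finite automaton; the encoding (state distributions as dictionaries, `out` for outputs, `delta` for transition distributions) and all names are assumptions for illustration, and the result only accounts for words shorter than `depth`.

```python
def step(P, delta, a):
    """Determinized transition: push the state distribution P through letter a."""
    Q = {}
    for x, w in P.items():
        for y, v in delta[x][a].items():
            Q[y] = Q.get(y, 0.0) + w * v
    return Q


def expect(P, out):
    """Expected output of a state distribution."""
    return sum(w * out[x] for x, w in P.items())


def approx_distance(P1, P2, out, delta, alphabet, c1, c2, depth):
    """Unfold d = c1*|p1(eps) - p2(eps)| + (c2/|A|) * sum_a d(...) `depth` times,
    starting from the zero pseudometric."""
    if depth == 0:
        return 0.0
    head = c1 * abs(expect(P1, out) - expect(P2, out))
    tail = sum(
        approx_distance(step(P1, delta, a), step(P2, delta, a),
                        out, delta, alphabet, c1, c2, depth - 1)
        for a in alphabet
    )
    return head + c2 / len(alphabet) * tail


# Hypothetical automaton: x always outputs 1, y always outputs 0, both loop on 'a'.
OUT_P = {"x": 1.0, "y": 0.0}
DELTA_P = {"x": {"a": {"x": 1.0}}, "y": {"a": {"y": 1.0}}}
```

With $c_1 = c_2 = 1/2$ and three unfoldings, the Dirac distributions on x and y are at distance $\frac12(1 + \frac12 + \frac14)$.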

6. Conclusion, Related and Future Work

In recent years, an impressive number of papers have studied behavioral distances
for both probabilistic and nondeterministic systems (see, e.g., [GJS90, DGJP04, VBW05,
BBLM15, dAFS04, dAFS09, FLT11]). The necessity of a general understanding of
such metrics is not a mere intellectual whim but it is perceived also by researchers
exploiting distances for differential privacy and quantitative information flow (see for
instance [CGPX14]). As far as we know, the first use of coalgebras for this purpose
dates back to [VBW05], where the authors consider systems and distance for a fixed
endofunctor on PMet. In [BBKK14], we introduced the Kantorovich and Wasserstein
approaches as a general way to define "canonical liftings" to PMet and behavioral
distances by finality. These are usually branching-time, while many properties of
interest for applications (see again [CGPX14]) are usually expressed by means of
distances on sets of traces. In this paper, we have shown that the work developed
in [BBKK14] can be fruitfully combined with [JSS15] to obtain various trace distances.

Among the several trace distances introduced in the literature, it is worth mentioning
[BBLM15, dAFS04, dAFS09, FLT11]. The distance introduced in [BBLM15] for
Semi-Markov chains with residence time is similar to the trace distance we obtain in
Example 5.6 for probabilistic automata. In [dAFS04, dAFS09], both branching-time and
linear-time distances are introduced for metric transition systems, namely Kripke
structures where states are associated with elements of a fixed (pseudo-)metric space $M$,
that would correspond to coalgebras of the form $X \to M \times \mathcal{P}(X)$. In [BBKK14], we have
shown an example capturing branching-time distance for metric transition systems, but
for linear distances we require a distributive law of the form $\mathcal{P}(M \times \_) \Rightarrow M \times \mathcal{P}(\_)$,
for which we would need at least $M$ carrying an algebra for the monad $\mathcal{P}$. We also
plan to investigate trace metrics in a Kleisli setting [HJS07], where it might be easier to
incorporate such examples.

There are two other direct consequences of our work that we did not explain in the
main text, but that are important properties of the distances that we obtain (and, indeed,
are mentioned in [CGPX14] amongst the desiderata for "good" metrics). First, the
behavioral branching-distance for $\overline{F}\,\overline{T}$ provides an upper bound to the linear-distance for $\widehat{\overline{F}}$,
analogously to the well-known fact that bisimilarity implies trace equivalence. To see
this, it is enough to observe that there is a functor from the category of $\overline{F}\,\overline{T}$-coalgebras
to the one of $\widehat{\overline{F}}$-coalgebras mapping $c\colon X \to \overline{F}\,\overline{T}X$ into $c^\sharp\colon \overline{T}X \to \overline{F}\,\overline{T}X$.

Second, since the final map $[\![-]\!]$ is a morphism in $\mathcal{EM}(\overline{T})$, the behavioral distance
for $\widehat{\overline{F}}$ is nonexpansive w.r.t. the operators of the monad $\overline{T}$. Nonexpansiveness with
respect to some operators is a desirable property which has been studied, for instance
in [DGJP04], as a generalization of the notion of being a congruence for behavioral
equivalence. Several researchers are now studying syntactic rule formats ensuring this
and other sorts of compositionality (see e.g. [GT13] and the references therein) and we
believe that our Corollary 5.2 may provide some helpful insights.

In this perspective, however, our results are still unsatisfactory if compared to what 
happens in the case of behavioral equivalences. From a fibrational point of view, one has 
a canonical lifting to Rel (the category of relations and relation preserving morphisms) 
such that compositionality holds on the nose and distributive laws always lift [Jac12,
Exercise 4.4.6]. The forgetful functor $U\colon \mathrm{PMet} \to \mathrm{Set}$ is also a fibration [BBKK14], but
Kantorovich and Wasserstein liftings are not always so well-behaved. Fibrations might 
be useful also to guarantee soundness of up-to techniques [BPPR14] for behavioral 
distances that, hopefully, will lead to more efficient proofs and algorithms. 

Another interesting future work would be to show that Kantorovich and Wasserstein 
liftings arise from some universal properties, i.e., that they are the smallest and largest 
metric in some continuum of metrics with certain properties. Here we would like 
to draw inspiration from [VB05] which characterizes the Giry monad via a universal 
property on monad morphisms. 

Finally, we would like to have an abstract understanding of the Kantorovich-Rubinstein
duality. Preliminary attempts suggest that this is very difficult: indeed the proof
for the probabilistic case relies on specific properties of distributions. 
P. Proofs

Here we provide proofs for the soundness of our definitions (where needed), the 
stated theorems, propositions, lemmas, examples and also for all claims made in the 
in-between texts. If a theorem environment starts with the symbol O it has been stated 
in the main text and is repeated here for convenience of the reader (using the numbering 
from the main text). Otherwise it is a new statement which clarifies/justifies claims 
made in the main text and its number starts with P. 

P.2. Preliminaries

For the upcoming proofs we will often use the following, alternative characterization of 
W3. 

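Lemma P.2.1 (Weak Pullback Characterization of W3). Let $F$
be an endofunctor on Set with evaluation function $\mathrm{ev}_F$ and
$i\colon \{0\} \hookrightarrow [0,\top]$ be the inclusion function. For any set $X$ we
denote the unique arrow into $\{0\}$ by $!_X\colon X \to \{0\}$. Then $\mathrm{ev}_F$
satisfies $\mathrm{ev}_F^{-1}[\{0\}] = Fi[F\{0\}]$ if and only if the square
$i \circ {!_{F\{0\}}} = \mathrm{ev}_F \circ Fi$ (with $!_{F\{0\}}\colon F\{0\} \to \{0\}$ and $Fi\colon F\{0\} \to F[0,\top]$) is a weak pullback.

Proof. Commutativity of the diagram is equivalent to $\mathrm{ev}_F^{-1}[\{0\}] \supseteq Fi[F\{0\}]$. Given a set
$X$ and a function $f\colon X \to F[0,\top]$ as depicted below, we conclude again by commutativity
($i \circ {!_X} = \mathrm{ev}_F \circ f$) that $f(x) \in \mathrm{ev}_F^{-1}[\{0\}]$ for all $x \in X$.

[Diagram: the square with top $!_{F\{0\}}\colon F\{0\} \to \{0\}$, left $Fi\colon F\{0\} \to F[0,\top]$, right $i\colon \{0\} \to [0,\top]$
and bottom $\mathrm{ev}_F\colon F[0,\top] \to [0,\top]$, together with $f\colon X \to F[0,\top]$, $!_X\colon X \to \{0\}$ and a mediating
arrow $\varphi\colon X \to F\{0\}$.]

Now if $\mathrm{ev}_F^{-1}[\{0\}] \subseteq Fi[F\{0\}]$ then for $f(x) \in \mathrm{ev}_F^{-1}[\{0\}]$ we can choose a (not necessarily
unique) $x_0 \in F\{0\}$ such that $f(x) = Fi(x_0)$. If we define $\varphi\colon X \to F\{0\}$ by $\varphi(x) = x_0$ then
clearly $\varphi$ makes the above diagram commute and thus we have a weak pullback.

Conversely if the diagram is a weak pullback consider the set $X = \mathrm{ev}_F^{-1}[\{0\}]$ and
the function $f\colon \mathrm{ev}_F^{-1}[\{0\}] \to F[0,\top]$, $f(x) = x$. Now for any $x \in \mathrm{ev}_F^{-1}[\{0\}]$ we have
$Fi(\varphi(x)) = (Fi \circ \varphi)(x) = f(x) = x$, hence since $\varphi(x) \in F\{0\}$ we have $x \in Fi[F\{0\}]$. □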

O Example 2.9 (Finitely Supported Distributions). The probability distribution functor
$\mathcal{D}$ assigns to each set $X$ the set $\mathcal{D}X = \{P\colon X \to [0,1] \mid |\mathrm{supp}(P)| < \infty, P(X) = 1\}$ and to
each function $f\colon X \to Y$ the function $\mathcal{D}f\colon \mathcal{D}X \to \mathcal{D}Y$, $\mathcal{D}f(P)(y) = \sum_{x \in f^{-1}[\{y\}]} P(x) =
P(f^{-1}[\{y\}])$. $\mathcal{D}$ preserves weak pullbacks and the evaluation function $\mathrm{ev}_{\mathcal{D}}\colon \mathcal{D}[0,1] \to
[0,1]$, $\mathrm{ev}_{\mathcal{D}}(P) = \sum_{r \in [0,1]} r \cdot P(r)$ is well-behaved. For any pseudometric space $(X, d)$ we
obtain the Wasserstein pseudometric which, for any $P_1, P_2 \in \mathcal{D}X$, is defined as

$d^{\downarrow \mathcal{D}}(P_1, P_2) = \min\Big\{\sum_{x_1, x_2 \in X} d(x_1, x_2) \cdot P(x_1, x_2) \ \Big|\ P \in \Gamma_{\mathcal{D}}(P_1, P_2)\Big\}$

The Kantorovich-Rubinstein duality [Vil09] holds from classical results in transportation
theory.

Proof. Weak pullback preservation, well-behavedness and the duality were already
presented in [BBKK14]. Here we just quickly check that indeed the infimum is a
minimum: Let $\mathrm{supp}(P_1) \cup \mathrm{supp}(P_2) = \{s_1, \dots, s_n\}$ be the union of the finite supports
of $P_1$ and $P_2$. Then define the following finitely many real numbers $p_{1i} := P_1(s_i)$,
$p_{2j} := P_2(s_j)$, $d_{ij} := d(s_i, s_j)$. Then the distance of $P_1$ and $P_2$ can be equivalently
expressed as the following LP:

minimize    $\sum_{1 \le i,j \le n} d_{ij} \cdot x_{ij}$
subject to  $\sum_{1 \le j \le n} x_{ij} = p_{1i}$,   $1 \le i \le n$
            $\sum_{1 \le i \le n} x_{ij} = p_{2j}$,   $1 \le j \le n$
            $0 \le x_{ij} \le 1$,   $1 \le i,j \le n$

whose feasible region is nonempty ($x_{ij} := p_{1i} \cdot p_{2j}$ is in it) and bounded. Thus we indeed
get an optimal solution $x^*_{ij}$ and can define the optimal coupling as $P^*(s_i, s_j) := x^*_{ij}$. □
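
The feasibility witness used in this proof is easy to check numerically. The Python sketch below builds the product coupling $x_{ij} = p_{1i} \cdot p_{2j}$, verifies the LP constraints and evaluates the objective; the cost it returns is only an upper bound on the Wasserstein distance, not the optimum. Function names and the three-point example are illustrative assumptions.

```python
def product_coupling(p1, p2):
    """The coupling x_ij = p1_i * p2_j used as feasibility witness."""
    return [[a * b for b in p2] for a in p1]


def is_feasible(x, p1, p2, tol=1e-9):
    """Check the LP constraints: row sums p1, column sums p2, entries in [0, 1]."""
    n = len(p1)
    rows = all(abs(sum(x[i]) - p1[i]) <= tol for i in range(n))
    cols = all(abs(sum(x[i][j] for i in range(n)) - p2[j]) <= tol for j in range(n))
    box = all(-tol <= x[i][j] <= 1 + tol for i in range(n) for j in range(n))
    return rows and cols and box


def cost(x, d):
    """Objective value sum_ij d_ij * x_ij of a coupling."""
    n = len(x)
    return sum(d[i][j] * x[i][j] for i in range(n) for j in range(n))


# Hypothetical instance on three points with d(s_i, s_j) = |i - j|.
P1 = [0.5, 0.5, 0.0]
P2 = [0.0, 0.5, 0.5]
D = [[abs(i - j) for j in range(3)] for i in range(3)]
X = product_coupling(P1, P2)
```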

O Example 2.10 (Product Bifunctor). The weak pullback preserving product bifunctor
$F\colon \mathrm{Set}^2 \to \mathrm{Set}$ maps two sets $X_1, X_2$ to $F(X_1, X_2) = X_1 \times X_2$ and two functions $f_i\colon X_i \to
Y_i$ to the function $F(f_1, f_2) = f_1 \times f_2$. In this paper we will use the well-behaved
evaluation functions $\mathrm{ev}_F\colon [0,1]^2 \to [0,1]$ presented in the table below. Therein we also
list the pseudometric $(d_1, d_2)^F\colon (X_1 \times X_2)^2 \to [0,\top]$ we obtain for pseudometric spaces
$(X_1, d_1)$, $(X_2, d_2)$.

Parameters | $\mathrm{ev}_F(r_1, r_2)$ | $(d_1, d_2)^F((x_1, x_2), (y_1, y_2))$
$c_1, c_2 \in (0,1]$ | $\max\{c_1 r_1, c_2 r_2\}$ | $\max\{c_1 d_1(x_1, y_1), c_2 d_2(x_2, y_2)\}$
$c_1, c_2 \in (0,1]$, $c_1 + c_2 \le 1$ | $c_1 r_1 + c_2 r_2$ | $c_1 d_1(x_1, y_1) + c_2 d_2(x_2, y_2)$

For $c_1 = c_2 = 1$, the first evaluation map yields exactly the categorical product in PMet.
In both cases the Kantorovich-Rubinstein duality holds and the supremum [infimum]
of the Kantorovich [Wasserstein] pseudometric is always a maximum [minimum].

Proof. We adapt the proof given in [BBKK14, Exa. 5.1] to also include the discounted
maximum (all other cases were covered there). First we show well-behavedness.

1. Let $f_i, g_i\colon X_i \to [0,\top]$ with $f_i \le g_i$ be given. Then we have

$\widetilde{F}(f_1, f_2) = \max(c_1 f_1, c_2 f_2) \le \max(c_1 g_1, c_2 g_2) = \widetilde{F}(g_1, g_2)$.

2. Let $t := (x_{11}, x_{21}, x_{12}, x_{22}) \in F([0,\top]^2, [0,\top]^2) = [0,\top]^2 \times [0,\top]^2$. We have to
show the inequality $d_e(\widetilde{F}(\pi_1, \pi_1)(t), \widetilde{F}(\pi_2, \pi_2)(t)) \le \widetilde{F}(d_e, d_e)(t)$. We observe that
$\widetilde{F}(d_e, d_e)(t) = \mathrm{ev}_F(d_e(x_{11}, x_{21}), d_e(x_{12}, x_{22}))$ and if we define $z_i = \mathrm{ev}_F(x_{i1}, x_{i2}) =
\max\{c_1 x_{i1}, c_2 x_{i2}\}$ then $d_e(\widetilde{F}(\pi_1, \pi_1)(t), \widetilde{F}(\pi_2, \pi_2)(t)) = d_e(z_1, z_2)$. We thus have to
show the inequality

$d_e(z_1, z_2) \le \mathrm{ev}_F(d_e(x_{11}, x_{21}), d_e(x_{12}, x_{22}))$.   (1)

If $z_1 = z_2$ this is obviously true because $d_e(z_1, z_2) = 0$ and the rhs is non-negative.
We now assume $z_1 > z_2$ (the other case is symmetrical). For $\infty = z_1 > z_2$ the
inequality holds because then $x_{11} = \infty$ or $x_{12} = \infty$ and $x_{21}, x_{22} < \infty$ (otherwise
we would have $z_2 = \infty$) so both lhs and rhs are $\infty$. Thus we can now restrict to
$\infty > z_1 > z_2$ where necessarily also $x_{11}, x_{12}, x_{21}, x_{22} < \infty$ (otherwise we would
have $z_1 = \infty$ or $z_2 = \infty$). According to [BBKK14, Lemma P.2.1], the inequality (1) is
equivalent to showing the two inequalities

$z_1 \le z_2 + \mathrm{ev}_F(d_e(x_{11}, x_{21}), d_e(x_{12}, x_{22}))$, and
$z_2 \le z_1 + \mathrm{ev}_F(d_e(x_{11}, x_{21}), d_e(x_{12}, x_{22}))$.

By our assumption ($\infty > z_1 > z_2$) the second of these inequalities is satisfied, so we
just have to show the first. If $z_1 = c_1 x_{11}$ we have

$z_2 + \max\{c_1 d_e(x_{11}, x_{21}), c_2 d_e(x_{12}, x_{22})\} \ge z_2 + c_1 d_e(x_{11}, x_{21}) = z_2 + c_1 |x_{11} - x_{21}|$
$\ge z_2 + c_1 (x_{11} - x_{21}) = z_2 + c_1 x_{11} - c_1 x_{21}$
$= z_2 + z_1 - c_1 x_{21} = z_1 + (z_2 - c_1 x_{21}) \ge z_1$

because $z_2 = \max\{c_1 x_{21}, c_2 x_{22}\} \ge c_1 x_{21}$ and therefore $(z_2 - c_1 x_{21}) \ge 0$. The same
line of argument can be applied if $z_1 = c_2 x_{12}$.

3. $F(i, i)[F(\{0\}, \{0\})] = (i \times i)[\{0\} \times \{0\}] = \{(0,0)\}$ and $\mathrm{ev}_F^{-1}[\{0\}] = \{(0,0)\}$.

We now prove that the Kantorovich-Rubinstein duality holds and simultaneously
that the supremum (in the Kantorovich pseudometric) is a maximum and the
infimum (of the Wasserstein pseudometric) is a minimum. Let $(X_1, d_1)$, $(X_2, d_2)$ be
pseudometric spaces and let $t_i = (x_{i1}, x_{i2}) \in F(X_1, X_2) = X_1 \times X_2$ be given. Their
unique coupling is $t := ((x_{11}, x_{21}), (x_{12}, x_{22})) \in \Gamma_F(t_1, t_2)$ and we have $\widetilde{F}(d_1, d_2)(t) =
\max\{c_1 d_1(x_{11}, x_{21}), c_2 d_2(x_{12}, x_{22})\}$. We define $f_i := d_i(x_{1i}, \_)$, which are nonexpansive
due to [BBKK14, Lemma 2.3]. Then we clearly have $f_i(x_{1i}) = 0$ and moreover

$d_e(\widetilde{F}(f_1, f_2)(t_1), \widetilde{F}(f_1, f_2)(t_2)) = d_e(\mathrm{ev}_F(f_1(x_{11}), f_2(x_{12})), \mathrm{ev}_F(f_1(x_{21}), f_2(x_{22})))$
$= d_e(0, \max\{c_1 d_1(x_{11}, x_{21}), c_2 d_2(x_{12}, x_{22})\})$
$= \max\{c_1 d_1(x_{11}, x_{21}), c_2 d_2(x_{12}, x_{22})\} = \widetilde{F}(d_1, d_2)(t)$

Due to [BBKK14, Proposition P.5.7] we can now conclude that duality holds and both
supremum and infimum are attained and equal to the above maximum. □
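
Inequality (1) for the discounted maximum can be sanity-checked on finite values. The Python sketch below evaluates both sides for given arguments; it is a numerical illustration under the assumption of finite inputs, not a substitute for the proof, and the helper names are hypothetical.

```python
def ev_max(c1, c2, r1, r2):
    """Discounted maximum evaluation function max{c1*r1, c2*r2}."""
    return max(c1 * r1, c2 * r2)


def w2_holds(c1, c2, x11, x12, x21, x22, tol=1e-12):
    """Check |z1 - z2| <= ev_F(|x11 - x21|, |x12 - x22|) for finite arguments,
    where z_i = ev_F(x_i1, x_i2)."""
    z1 = ev_max(c1, c2, x11, x12)
    z2 = ev_max(c1, c2, x21, x22)
    rhs = ev_max(c1, c2, abs(x11 - x21), abs(x12 - x22))
    return abs(z1 - z2) <= rhs + tol
```

Running `w2_holds` over a grid of values in $[0,1]$ exercises both branches $z_1 = c_1 x_{11}$ and $z_1 = c_2 x_{12}$ of the case distinction.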

P.3. Compositionality for the Wasserstein Lifting
P.3.1. Compositionality for Endofunctors

We first collect a few simple observations that we will use in the upcoming proofs. 

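Lemma P.3.1. Let $F$, $G$ be endofunctors on Set with evaluation functions $\mathrm{ev}_F$, $\mathrm{ev}_G$ and
$a := \langle G\pi_1, G\pi_2 \rangle$ (i.e. the unique mediating arrow into the product) and $(X, d)$ an arbitrary
pseudometric space. Then the following holds.

1. $\widetilde{G}d \ge d^{\downarrow G} \circ a \ge d^{\uparrow G} \circ a$

2. $\forall t_1, t_2 \in FGX\colon\ t \in \Gamma_{FG}(t_1, t_2) \implies Fa(t) \in \Gamma_F(t_1, t_2)$.

3. If $F$ and $G$ preserve weak pullbacks then so does $FG$.

4. For any $f \in \mathrm{Set}/[0,\top]$ we have $\widetilde{FG}f = \widetilde{F}(\widetilde{G}f)$.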

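Proof. We first of all observe that $a$ is the unique mediating arrow into the product as
indicated in the following diagram.

[Diagram: $a\colon G(X \times X) \to GX \times GX$ with $\pi_i^{GX} \circ a = G\pi_i$, where $\pi_i^{GX}\colon GX \times GX \to GX$ are the projections.]

1. Let $s \in G(X \times X)$ and define $s_i := G\pi_i(s) = \pi_i^{GX} \circ a(s)$. Then by definition $s \in
\Gamma_G(s_1, s_2)$ and we conclude $\widetilde{G}d(s) \ge \inf\{\widetilde{G}d(s') \mid s' \in \Gamma_G(s_1, s_2)\} = d^{\downarrow G}(s_1, s_2) =
d^{\downarrow G}(\pi_1^{GX} \circ a(s), \pi_2^{GX} \circ a(s)) = d^{\downarrow G} \circ a(s)$. Since we always have $d^{\downarrow G} \ge d^{\uparrow G}$ as
shown in [BBKK14], the statement follows.

2. $F\pi_i^{GX}(Fa(t)) = F(\pi_i^{GX} \circ a)(t) = F(G\pi_i)(t) = FG\pi_i(t) = t_i$.

3. This is indeed clear by definition.

4. Let $f\colon X \to [0,\top]$, then $\widetilde{FG}f = \mathrm{ev}_{FG} \circ FGf = \mathrm{ev}_F \circ F\mathrm{ev}_G \circ FGf = \mathrm{ev}_F \circ F(\mathrm{ev}_G \circ Gf) =
\widetilde{F}(\widetilde{G}f)$. □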

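Lemma P.3.2. Let $F$, $G$ be functors with evaluation functions $\mathrm{ev}_F$ and $\mathrm{ev}_G$ and define
$\mathrm{ev}_{FG} := \mathrm{ev}_F \circ F\mathrm{ev}_G$. Then the following holds.

1. If $\widetilde{F}$ and $\widetilde{G}$ are monotone (Condition W1), then so is $\widetilde{FG}$.

2. If $G$ preserves weak pullbacks, $\mathrm{ev}_G$ is well-behaved and $\widetilde{F}$ is monotone then $\mathrm{ev}_{FG}$ satisfies
Condition W2 of well-behavedness.

3. If $F$ preserves weak pullbacks and $\mathrm{ev}_F$, $\mathrm{ev}_G$ satisfy Condition W3 of well-behavedness, then
also $\mathrm{ev}_{FG}$ satisfies Condition W3 of well-behavedness.

Proof. 1. Let $f, g\colon X \to [0,\top]$ with $f \le g$, then by monotonicity of $\mathrm{ev}_G$ we have
$\widetilde{G}f \le \widetilde{G}g$ and using monotonicity of $\mathrm{ev}_F$ we get $\widetilde{FG}f = \widetilde{F}(\widetilde{G}f) \le \widetilde{F}(\widetilde{G}g) = \widetilde{FG}g$.

2. Let $t \in FG([0,\top]^2)$ and define $t_i := FG\pi_i(t) \in FG[0,\top]$. By definition $t \in \Gamma_{FG}(t_1, t_2)$
so Lemma P.3.1 tells us $Fa(t) \in \Gamma_F(t_1, t_2)$ for $a := \langle G\pi_1, G\pi_2 \rangle$. Moreover, since
$\mathrm{ev}_G\colon (G[0,\top], d_e^{\uparrow G}) \to ([0,\top], d_e)$ is nonexpansive (by definition of the Kantorovich
pseudometric), we can apply [BBKK14, Prop. P.4.2] to obtain the inequality

$d_e(\mathrm{ev}_{FG}(t_1), \mathrm{ev}_{FG}(t_2)) = d_e(\widetilde{F}\mathrm{ev}_G(t_1), \widetilde{F}\mathrm{ev}_G(t_2)) \le \widetilde{F}d_e^{\uparrow G}(Fa(t)) = \widetilde{F}(d_e^{\uparrow G} \circ a)(t)$.

By Lemma P.3.1 we have $d_e^{\uparrow G} \circ a \le \widetilde{G}d_e$ and using monotonicity of $\widetilde{F}$ we can
continue our inequality with $\widetilde{F}(d_e^{\uparrow G} \circ a)(t) \le \widetilde{F}(\widetilde{G}d_e)(t) = \widetilde{FG}d_e(t)$ which concludes
the proof.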

3. Using Lemma P.2.1 we just have to show that the following diagram is a weak
pullback.

[Diagram: two adjacent squares whose outer rectangle has top arrow $!_{FG\{0\}}$ and bottom arrow $\mathrm{ev}_{FG}$.]

Lemma P.2.1 tells us that the right square is a weak pullback and since $F$ preserves
weak pullbacks also the left square is. The outer part is necessarily a weak pullback
again yielding by Lemma P.2.1 that $\mathrm{ev}_{FG}$ satisfies the third condition. □

O Proposition 3.1 (Well-Behavedness of Composed Evaluation Function). Let $F$, $G$ be
endofunctors on Set with evaluation functions $\mathrm{ev}_F$, $\mathrm{ev}_G$. If both functors preserve weak
pullbacks and both evaluation functions are well-behaved then also $\mathrm{ev}_{FG} = \mathrm{ev}_F \circ F\mathrm{ev}_G$ is
well-behaved.

Proof. This is an immediate corollary of Lemma P.3.2. □

To prove our compositionality criteria, we use the following results. 

Lemma P.3.3. Let $F$, $G$ be endofunctors on Set with evaluation functions $\mathrm{ev}_F\colon F[0,\top] \to [0,\top]$,
$\mathrm{ev}_G\colon G[0,\top] \to [0,\top]$. We define $\mathrm{ev}_{FG} := \mathrm{ev}_F \circ F\mathrm{ev}_G$. Then the following holds for every
pseudometric space $(X, d)$.




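1. $d^{\uparrow FG} \le (d^{\uparrow G})^{\uparrow F}$.

2. If $F$ and $G$ preserve weak pullbacks and $\mathrm{ev}_F$, $\mathrm{ev}_G$ are well-behaved then $d^{\downarrow FG} \ge (d^{\downarrow G})^{\downarrow F}$.

3. If for all $t_1, t_2 \in FGX$ there is a function $\nabla(t_1, t_2)\colon \Gamma_F(t_1, t_2) \to \Gamma_{FG}(t_1, t_2)$ such that
$\widetilde{FG}d \circ \nabla(t_1, t_2) = \widetilde{F}d^{\downarrow G}$ then $d^{\downarrow FG} \le (d^{\downarrow G})^{\downarrow F}$.

Proof. Let $t_1, t_2 \in FGX$.

1. Recall that $d^{\uparrow G}$ is the smallest pseudometric such that for every nonexpansive
function $f\colon (X, d) \to ([0,\top], d_e)$ also $\widetilde{G}f\colon (GX, d^{\uparrow G}) \to ([0,\top], d_e)$ is nonexpansive
(see remark after [BBKK14, Def. 3.1]). Moreover, $\widetilde{FG}f = \widetilde{F}(\widetilde{G}f)$ by Lemma P.3.1. Thus

$d^{\uparrow FG}(t_1, t_2) = \sup\{d_e(\widetilde{FG}f(t_1), \widetilde{FG}f(t_2)) \mid f\colon (X, d) \to ([0,\top], d_e) \text{ nonexpansive}\}$
$= \sup\{d_e(\widetilde{F}(\widetilde{G}f)(t_1), \widetilde{F}(\widetilde{G}f)(t_2)) \mid f\colon (X, d) \to ([0,\top], d_e) \text{ nonexpansive}\}$
$\le \sup\{d_e(\widetilde{F}(g)(t_1), \widetilde{F}(g)(t_2)) \mid g\colon (GX, d^{\uparrow G}) \to ([0,\top], d_e) \text{ nonexpansive}\}$
$= (d^{\uparrow G})^{\uparrow F}(t_1, t_2)$

2. Lemma P.3.1 tells us $\widetilde{G}d \ge d^{\downarrow G} \circ a$ and for any coupling $t \in \Gamma_{FG}(t_1, t_2)$ we have
$Fa(t) \in \Gamma_F(t_1, t_2)$. Using these facts and the monotonicity of $\widetilde{F}$ we obtain:

$d^{\downarrow FG}(t_1, t_2) = \inf\{\widetilde{FG}d(t) \mid t \in \Gamma_{FG}(t_1, t_2)\} = \inf\{\widetilde{F}(\widetilde{G}d)(t) \mid t \in \Gamma_{FG}(t_1, t_2)\}$
$\ge \inf\{\widetilde{F}(d^{\downarrow G} \circ a)(t) \mid t \in \Gamma_{FG}(t_1, t_2)\}$
$= \inf\{\widetilde{F}d^{\downarrow G}(Fa(t)) \mid t \in \Gamma_{FG}(t_1, t_2)\}$
$\ge \inf\{\widetilde{F}d^{\downarrow G}(t') \mid t' \in \Gamma_F(t_1, t_2)\} = (d^{\downarrow G})^{\downarrow F}(t_1, t_2)$

3. Using $\nabla(t_1, t_2)$ we compute

$d^{\downarrow FG}(t_1, t_2) = \inf\{\widetilde{FG}d(t') \mid t' \in \Gamma_{FG}(t_1, t_2)\}$
$\le \inf\{\widetilde{FG}d(\nabla(t_1, t_2)(t)) \mid t \in \Gamma_F(t_1, t_2)\}$
$= \inf\{\widetilde{F}d^{\downarrow G}(t) \mid t \in \Gamma_F(t_1, t_2)\} = (d^{\downarrow G})^{\downarrow F}(t_1, t_2)$. □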

With this result at hand we can prove 

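O Proposition 3.2 (Compositionality). Let $F$, $G$ be weak pullback preserving endofunctors on
Set with well-behaved evaluation functions $\mathrm{ev}_F$, $\mathrm{ev}_G$ and let $(X, d)$ be a pseudometric space.
Then $d^{\downarrow FG} \ge (d^{\downarrow G})^{\downarrow F}$. Moreover, if for all $t_1, t_2 \in GX$ there is an optimal $G$-coupling, i.e.
$\gamma(t_1, t_2) \in \Gamma_G(t_1, t_2)$ such that $d^{\downarrow G}(t_1, t_2) = \widetilde{G}d(\gamma(t_1, t_2))$, then $d^{\downarrow FG} = (d^{\downarrow G})^{\downarrow F}$.

Proof. From Lemma P.3.3.2, we know $d^{\downarrow FG} \ge (d^{\downarrow G})^{\downarrow F}$. By our requirement we have
a function $\gamma\colon GX \times GX \to G(X \times X)$, such that $d^{\downarrow G} = \widetilde{G}d \circ \gamma$. Given $t_1, t_2 \in FGX$
and $t \in \Gamma_F(t_1, t_2)$, we define $\nabla(t_1, t_2)(t) = F\gamma(t)$, then this satisfies the conditions of
Lemma P.3.3.3. First, we have $F\gamma(t) \in \Gamma_{FG}(t_1, t_2)$ because $FG\pi_i(F\gamma(t)) = F(G\pi_i \circ
\gamma)(t) = F\pi_i^{GX}(t) = t_i$ (with $\pi_i^{GX}\colon GX \times GX \to GX$ the projections). Moreover

$\widetilde{FG}d(F\gamma(t)) = \mathrm{ev}_{FG} \circ F(Gd \circ \gamma)(t) = \mathrm{ev}_F \circ F\mathrm{ev}_G \circ F(Gd \circ \gamma)(t)$
$= \mathrm{ev}_F \circ F(\widetilde{G}d \circ \gamma)(t) = \mathrm{ev}_F \circ Fd^{\downarrow G}(t) = \widetilde{F}d^{\downarrow G}(t)$. □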

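O Example 3.3. We consider the finite powerset functor $\mathcal{P}_{fin}$ of Example 2.8 and the
distribution functor $\mathcal{D}$ of Example 2.9 with their evaluation functions. Let $(X, d)$ be a
pseudometric space.

1. We have $d^{\downarrow \mathcal{D}\mathcal{D}} = (d^{\downarrow \mathcal{D}})^{\downarrow \mathcal{D}}$ by Proposition 3.2, because optimal couplings always
exist.

2. We have $d^{\downarrow \mathcal{P}_{fin}\mathcal{P}_{fin}} = (d^{\downarrow \mathcal{P}_{fin}})^{\downarrow \mathcal{P}_{fin}}$ although $\mathcal{P}_{fin}$-couplings do not always exist.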


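Proof. We just have to prove the second claim. We already know from Lemma P.3.3.2,
that

$d^{\downarrow \mathcal{P}_{fin}\mathcal{P}_{fin}} \ge (d^{\downarrow \mathcal{P}_{fin}})^{\downarrow \mathcal{P}_{fin}}$   (2)

holds. We now show that we always have equality. Let $(X, d)$ be a pseudometric space
and $T_1, T_2 \in \mathcal{P}_{fin}\mathcal{P}_{fin}X$. We distinguish three cases:

Case 1: If $T_1 = T_2 = \emptyset$ we know by reflexivity that both values are 0.

Case 2: If $T_1 = \emptyset \neq T_2$ or $T_1 \neq \emptyset = T_2$ we know from [BBKK14] that $\Gamma_{\mathcal{P}_{fin}}(T_1, T_2) = \emptyset$ and
therefore $(d^{\downarrow \mathcal{P}_{fin}})^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) = \top$ and thus (2) is an equality.

Case 3: Let $T_1, T_2 \neq \emptyset$. We know from [BBKK14] that we have an optimal coupling
$T^* \in \Gamma_{\mathcal{P}_{fin}}(T_1, T_2)$, say $T^* = \{(V_{j1}, V_{j2}) \in \mathcal{P}_{fin}X \times \mathcal{P}_{fin}X \mid j \in J\}$ for a suitable index set $J$.
Then $T_i = \mathcal{P}_{fin}\pi_i(T^*) = \pi_i[T^*] = \{\pi_i((V_{j1}, V_{j2})) \mid j \in J\} = \{V_{ji} \mid j \in J\}$. By optimality:

$(d^{\downarrow \mathcal{P}_{fin}})^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) = \max d^{\downarrow \mathcal{P}_{fin}}[T^*] = \max_{j \in J} d^{\downarrow \mathcal{P}_{fin}}(V_{j1}, V_{j2})$   (3)

Again we will make a case distinction:

► If there is a $j' \in J$ such that $\Gamma_{\mathcal{P}_{fin}}(V_{j'1}, V_{j'2}) = \emptyset$, we have $d^{\downarrow \mathcal{P}_{fin}}(V_{j'1}, V_{j'2}) = \top$ and
using (3) also $(d^{\downarrow \mathcal{P}_{fin}})^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) = \top$ which again shows that (2) is an equality.

► Otherwise we can take optimal couplings $V_j^* \in \Gamma_{\mathcal{P}_{fin}}(V_{j1}, V_{j2})$. Continuing (3) we
have

$(d^{\downarrow \mathcal{P}_{fin}})^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) = \max_{j \in J} \widetilde{\mathcal{P}_{fin}}d(V_j^*) = \max_{j \in J} \max d[V_j^*]$   (4)

Then we define $T := \{V_j^* \mid j \in J\} \in \mathcal{P}_{fin}\mathcal{P}_{fin}(X \times X)$. We calculate for $\pi_i\colon X \times X \to X$

$\mathcal{P}_{fin}\mathcal{P}_{fin}\pi_i(T) = \mathcal{P}_{fin}\pi_i[T] = \{\mathcal{P}_{fin}\pi_i(V_j^*) \mid j \in J\} = \{V_{ji} \mid j \in J\} = T_i$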




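and thus $T \in \Gamma_{\mathcal{P}_{fin}\mathcal{P}_{fin}}(T_1, T_2)$. Moreover we have

$\widetilde{\mathcal{P}_{fin}\mathcal{P}_{fin}}d(T) = \max(\mathcal{P}_{fin}\max(\mathcal{P}_{fin}\mathcal{P}_{fin}d(T)))$
$= \max(\max[\mathcal{P}_{fin}d[T]]) = \max(\max[\{d[V_j^*] \mid j \in J\}])$
$= \max_{j \in J} \max d[V_j^*]$   (5)

thus using this, (4) and (2) we conclude that

$d^{\downarrow \mathcal{P}_{fin}\mathcal{P}_{fin}}(T_1, T_2) \le \max_{j \in J} \max d[V_j^*] = (d^{\downarrow \mathcal{P}_{fin}})^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) \le d^{\downarrow \mathcal{P}_{fin}\mathcal{P}_{fin}}(T_1, T_2)$

which proves equality. □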
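
For small finite sets the optimal $\mathcal{P}_{fin}$-couplings used in this proof can be found by exhaustive search. The Python sketch below enumerates all couplings (relations whose two projections are the given sets), minimises the maximal distance, and returns $\top$ when no coupling exists; for comparison it also computes the Hausdorff distance of two non-empty sets. The names, the brute-force strategy and the sample sets are illustrative assumptions.

```python
from itertools import combinations, product


def couplings(V1, V2):
    """All relations R on V1 x V2 whose projections are exactly V1 and V2."""
    pairs = list(product(V1, V2))
    for k in range(len(pairs) + 1):
        for R in combinations(pairs, k):
            if {x for x, _ in R} == set(V1) and {y for _, y in R} == set(V2):
                yield R


def wasserstein_pfin(V1, V2, d, top=1.0):
    """Minimum over couplings of the maximal distance; `top` if none exists."""
    costs = [max((d(x, y) for x, y in R), default=0.0) for R in couplings(V1, V2)]
    return min(costs, default=top)


def hausdorff(V1, V2, d):
    """Hausdorff distance of two non-empty finite sets."""
    return max(
        max(min(d(x, y) for y in V2) for x in V1),
        max(min(d(x, y) for x in V1) for y in V2),
    )


def dist(x, y):
    return abs(x - y)
```

The search is exponential in $|V_1| \cdot |V_2|$ and is only meant for very small examples.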

To verify the claims made in Example 3.4 we need the following intermediary result. 

Lemma P.3.4. For finite $A$ and functions $f, g\colon A \to [0,\infty]$ we have

1. $d_e(\max_{a \in A} f(a), \max_{a \in A} g(a)) \le \max_{a \in A} d_e(f(a), g(a))$.

2. $d_e(\sum_{a \in A} f(a), \sum_{a \in A} g(a)) \le \sum_{a \in A} d_e(f(a), g(a))$.

Proof. 1. Let $a_f \in \mathrm{argmax}_{a \in A} f(a)$ and $a_g \in \mathrm{argmax}_{a \in A} g(a)$, i.e. $f(a_f) = \max_{a \in A} f(a)$
and $g(a_g) = \max_{a \in A} g(a)$. If $f(a_f) = g(a_g)$ the lhs is 0 and the inequality is satisfied.
From here we assume wlog $f(a_f) > g(a_g)$. Now if $f(a_f) = \infty$, the lhs is $\infty$ but
also $\max_{a \in A} d_e(f(a), g(a)) \ge d_e(f(a_f), g(a_f)) = \infty$. Finally, for $f(a_f) < \infty$ we
have $g(a_f) \le g(a_g)$ and thus $d_e(f(a_f), g(a_g)) = f(a_f) - g(a_g) \le f(a_f) - g(a_f) \le
\max_{a \in A} d_e(f(a), g(a))$.

2. Let $S_f := \sum_{a \in A} f(a)$ and $S_g := \sum_{a \in A} g(a)$. If $S_f = S_g$ the lhs is 0 and the inequality
is satisfied. From here we assume wlog $S_f > S_g$. Now if $S_f = \infty$, the lhs is
$\infty$ but we also must have an $a' \in A$ such that $f(a') = \infty$ (otherwise $S_f < \infty$)
and thus $\sum_{a \in A} d_e(f(a), g(a)) \ge d_e(f(a'), g(a')) = \infty$. Finally, for $S_f < \infty$ we
have $d_e(S_f, S_g) = S_f - S_g = \sum_{a \in A} f(a) - \sum_{a \in A} g(a) = \sum_{a \in A} (f(a) - g(a)) \le
\sum_{a \in A} |f(a) - g(a)| = \sum_{a \in A} d_e(f(a), g(a))$. □
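
Both inequalities of the lemma, including the cases with infinite values, can be exercised numerically. In the Python sketch below `d_e` extends the Euclidean distance to $[0,\infty]$ by setting the distance of $\infty$ to itself to 0; the function names and the tuple encoding of $f$ and $g$ are assumptions for illustration.

```python
import math


def d_e(r, s):
    """Euclidean distance on [0, inf], with d_e(inf, inf) = 0."""
    if r == s:
        return 0.0
    return abs(r - s)


def lemma_holds(f, g):
    """f and g are equally long tuples of values in [0, inf] (one per input).
    Returns the truth values of the max- and the sum-inequality."""
    pointwise = [d_e(x, y) for x, y in zip(f, g)]
    max_ok = d_e(max(f), max(g)) <= max(pointwise)
    sum_ok = d_e(sum(f), sum(g)) <= sum(pointwise)
    return max_ok, sum_ok
```

The early return in `d_e` avoids evaluating `inf - inf`, which would yield NaN in floating point.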

O Example 3.4 (Input Functor). Let $A$ be a fixed finite set of inputs. The input functor
$F = \_^A\colon \mathrm{Set} \to \mathrm{Set}$ maps a set $X$ to the exponential $X^A$ and a function $f\colon X \to Y$
to $f^A\colon X^A \to Y^A$, $f^A(g) = f \circ g$. This functor preserves weak pullbacks. The two
evaluation functions listed below are well-behaved and yield the given Wasserstein
pseudometric on $X^A$ for any pseudometric space $(X, d)$.

$\mathrm{ev}_F(s)$ | $d^{\downarrow F}(s_1, s_2)$
$\max_{a \in A} s(a)$ | $\max_{a \in A} d(s_1(a), s_2(a))$
$\sum_{a \in A} s(a)$ | $\sum_{a \in A} d(s_1(a), s_2(a))$
Proof. We first show that the functor $F := \_^A$ on Set preserves pullbacks. If we have a
pullback in Set as indicated in the left of the diagram below, then we have to show that
the right diagram is a pullback.

[Diagrams: left, the pullback square with $p_i\colon P \to X_i$ and $f_i\colon X_i \to Y$; right, its image under $\_^A$
with $p_i^A\colon P^A \to X_i^A$ and $f_i^A\colon X_i^A \to Y^A$.]

We consider the canonical pullback: $P := \{(x_1, x_2) \in X_1 \times X_2 \mid f_1(x_1) = f_2(x_2)\}$ along
with $p_i := \pi_i|_P$ and

$P' := \{(g_1, g_2) \in X_1^A \times X_2^A \mid f_1^A(g_1) = f_2^A(g_2)\}$
$= \{\langle g_1, g_2 \rangle \in (X_1 \times X_2)^A \mid \forall a \in A.\, f_1(g_1(a)) = f_2(g_2(a))\}$
$= \{\langle g_1, g_2 \rangle \in (X_1 \times X_2)^A \mid \forall a \in A.\, (g_1(a), g_2(a)) \in P\} = P^A$

which completes the proof of weak pullback preservation. We now show that the
evaluation functions are well-behaved. For $f\colon X \to [0,\top]$ we have $\widetilde{F}f = \mathrm{ev}_F \circ f^A$, i.e.
applying it to $g \in X^A$ yields $\max_{a \in A} f(g(a))$ or $\sum_{a \in A} f(g(a))$.

W1. For $f_1, f_2\colon X \to [0,\top]$ with $f_1 \le f_2$ we obviously also have $\widetilde{F}f_1 \le \widetilde{F}f_2$.

W2. Let $t \in ([0,\top]^2)^A$ and $t_i := \pi_i^A(t)$, i.e. necessarily $t = \langle t_1, t_2 \rangle$. We have to show

$d_e(\mathrm{ev}_F(t_1), \mathrm{ev}_F(t_2)) \le \widetilde{F}d_e(t) = \mathrm{ev}_F(d_e^A(t)) = \mathrm{ev}_F(d_e \circ t) = \mathrm{ev}_F(d_e \circ \langle t_1, t_2 \rangle)$,

which for our evaluation functions follows from Lemma P.3.4 with $f = t_1$, $g = t_2$.

W3. We have $\mathrm{ev}_F^{-1}[\{0\}] = \{g\colon A \to [0,\top] \mid \mathrm{ev}_F(g) = 0\}$. Clearly for both functions this is
the case only if $g$ is the constant 0-function. Since $\{0\}$ is a final object in Set, there
is a unique function $z\colon A \to \{0\}$. Thus $Fi[F\{0\}] = i^A[\{0\}^A] = i^A[\{z\}] = \{i \circ z\}$ and
clearly $i \circ z\colon A \to [0,\top]$ is also the constant 0-function.

Now if we have $s_1, s_2 \in X^A$ their unique coupling is $s := \langle s_1, s_2 \rangle\colon A \to X \times X$. Moreover
$\widetilde{F}d(s) = \mathrm{ev}_F(d^A(s)) = \mathrm{ev}_F(\lambda a.d(\langle s_1, s_2 \rangle(a))) = \mathrm{ev}_F(\lambda a.d(s_1(a), s_2(a)))$ and using the two
different evaluation functions we obtain the given pseudometrics. □
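
The last step of this proof, evaluating $\mathrm{ev}_F$ on $d \circ \langle s_1, s_2 \rangle$, translates directly into code. In the Python sketch below functions $A \to X$ are dictionaries over the same set of inputs and the evaluation function is passed as a parameter (Python's built-in `max` or `sum`); the name `lifted` and the sample data are hypothetical.

```python
def lifted(ev, d, s1, s2):
    """Evaluate ev_F on d . <s1, s2>, where <s1, s2> is the unique coupling of
    s1, s2 : A -> X (dicts sharing the same keys)."""
    return ev([d(s1[a], s2[a]) for a in s1])


def dist(x, y):
    return abs(x - y)


S1 = {"a": 0.0, "b": 1.0}
S2 = {"a": 0.5, "b": 0.75}
```

Here `lifted(max, dist, S1, S2)` gives the maximum-based pseudometric and `lifted(sum, dist, S1, S2)` the sum-based one.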


P.3.2. Compositionality for Multifunctors

We conclude this section with a more detailed presentation on how our theory extends 
to multifunctors. 

For $n \in \mathbb{N}$ we denote by $[n] := \{1, \dots, n\} \subset \mathbb{N}$ the set of all positive natural
numbers less than or equal to $n$. Now let $n_i \in \mathbb{N}$ for all $i \in [n]$ and $F\colon \mathrm{Set}^n \to
\mathrm{Set}$ and $G_i\colon \mathrm{Set}^{n_i} \to \mathrm{Set}$ (for $i \in [n]$) be multifunctors with evaluation functions
$\mathrm{ev}_F\colon F([0,\top]^n) \to [0,\top]$ and $\mathrm{ev}_{G_i}\colon G_i([0,\top]^{n_i}) \to [0,\top]$. We define $N := \sum_{i=1}^n n_i$ and
define the functor

$H := F \circ \prod_{i=1}^n G_i = F \circ (G_1 \times G_2 \times \dots \times G_n)\colon \mathrm{Set}^N \to \mathrm{Set}$

Then we can define the evaluation function $\mathrm{ev}_H\colon H([0,\top]^N) \to [0,\top]$ by

$\mathrm{ev}_H := \mathrm{ev}_F \circ F(\mathrm{ev}_{G_1}, \mathrm{ev}_{G_2}, \dots, \mathrm{ev}_{G_n})$.

In this setting, compositionality for the Wasserstein lifting means that whenever we
have $N$ pseudometric spaces $(X_i, d_i)$ the pseudometric $(d_1, \dots, d_N)^{\downarrow H}$ is equal to

$\big((d_1, \dots, d_{n_1})^{\downarrow G_1}, (d_{n_1+1}, \dots, d_{n_1+n_2})^{\downarrow G_2}, \dots, (d_{N-n_n+1}, \dots, d_N)^{\downarrow G_n}\big)^{\downarrow F}$.

In the examples in this paper we will just have the following two cases: 

1. $n = 1$, $n_1 = 2$ so that $F\colon \mathrm{Set} \to \mathrm{Set}$ is an endofunctor with evaluation function
$\mathrm{ev}_F\colon F[0,\top] \to [0,\top]$ and $G\colon \mathrm{Set}^2 \to \mathrm{Set}$ is a bifunctor with evaluation function
$\mathrm{ev}_G\colon G([0,\top],[0,\top]) \to [0,\top]$. Then we have $N = n_1 = 2$ and obtain the bifunctor
$H = F \circ G\colon \mathrm{Set}^2 \to \mathrm{Set}$ with evaluation $\mathrm{ev}_H = \mathrm{ev}_F \circ F\mathrm{ev}_G\colon FG([0,\top],[0,\top]) \to [0,\top]$.
Compositionality means that for any two pseudometric spaces $(X_1, d_1)$, $(X_2, d_2)$ we
have $(d_1, d_2)^{\downarrow H} = \left((d_1, d_2)^{\downarrow G}\right)^{\downarrow F}$.

2. $n = 2$, $n_1 = n_2 = 1$ so that $F\colon \mathrm{Set}^2 \to \mathrm{Set}$ is a bifunctor with evaluation function
$\mathrm{ev}_F\colon F([0,\top],[0,\top]) \to [0,\top]$ and $G_1, G_2\colon \mathrm{Set} \to \mathrm{Set}$ are endofunctors with
evaluations $\mathrm{ev}_{G_i}\colon G_i[0,\top] \to [0,\top]$. Then we have $N = n_1 + n_2 = 1 + 1 = 2$
and obtain the bifunctor $H = F \circ (G_1 \times G_2)\colon \mathrm{Set}^2 \to \mathrm{Set}$ with evaluation
$\mathrm{ev}_H = \mathrm{ev}_F \circ F(\mathrm{ev}_{G_1}, \mathrm{ev}_{G_2})\colon F(G_1[0,\top], G_2[0,\top]) \to [0,\top]$. Compositionality means that for
any two pseudometric spaces $(X_1, d_1)$, $(X_2, d_2)$ we have $(d_1, d_2)^{\downarrow H} = \left(d_1^{\downarrow G_1}, d_2^{\downarrow G_2}\right)^{\downarrow F}$.

The results presented for endofunctors work analogously in the multifunctor case (the
proofs can be transferred almost verbatim), so we do not explicitly present them here.

O Example 3.5 (Machine Bifunctor). Let $A$ be a finite set of inputs, $I = \_^A$ the input
functor of Example 3.4, Id the identity endofunctor on Set and $P$ be the product
bifunctor of Example 2.10. The machine bifunctor is the composition $M := P \circ (\mathrm{Id} \times I)$,
i.e. the bifunctor $M\colon \mathrm{Set}^2 \to \mathrm{Set}$ with $M(B, X) := B \times X^A$. Since for Id and $I$ there are
unique (thus optimal) couplings, we have compositionality. Depending on the choices
of evaluation function for $P$ and $I$ (for Id we always take $\mathrm{id}_{[0,1]}$) we obtain the following
well-behaved evaluation functions $\mathrm{ev}_M\colon [0,1] \times [0,1]^A \to [0,1]$.


| Parameters | $\mathrm{ev}_P(r_1, r_2)$ | $\mathrm{ev}_I(s)$ | $\mathrm{ev}_M(o, s)$ |
|---|---|---|---|
| $c_1, c_2 \in (0,1]$ | $\max\{c_1 r_1, c_2 r_2\}$ | $\max_{a \in A} s(a)$ | $\max\{c_1 o,\ c_2 \max_{a \in A} s(a)\}$ |
| $c_1, c_2 \in (0,1]$, $c_1 + c_2 \le 1$ | $c_1 r_1 + c_2 r_2$ | $\frac{1}{\lvert A\rvert} \sum_{a \in A} s(a)$ | $c_1 o + \frac{c_2}{\lvert A\rvert} \sum_{a \in A} s(a)$ |

Let $(B, d_B)$, $(X, d)$ be pseudometric spaces. For any $t_1, t_2 \in M(B, X)$ with $t_i = (b_i, s_i) \in B \times X^A$
there is a unique and therefore necessarily optimal coupling $t := (b_1, b_2, \langle s_1, s_2\rangle)$.
Depending on the evaluation function, we obtain for the first case

$$(d_B, d)^{\downarrow M}(t_1, t_2) = \max\left\{c_1\, d_B(b_1, b_2),\ c_2 \cdot \max_{a \in A} d(s_1(a), s_2(a))\right\}$$

and for the second case

$$(d_B, d)^{\downarrow M}(t_1, t_2) = c_1\, d_B(b_1, b_2) + \frac{c_2}{\lvert A\rvert} \sum_{a \in A} d(s_1(a), s_2(a)).$$

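Proof. We first compute the composed evaluation functions. Let $(o, s) \in [0,1] \times [0,1]^A$,
then

$$\mathrm{ev}_M(o, s) = \mathrm{ev}_P \circ P(\mathrm{id}_{[0,\top]}, \mathrm{ev}_I)(o, s) = \mathrm{ev}_P \circ (\mathrm{id}_{[0,\top]} \times \mathrm{ev}_I)(o, s) = \mathrm{ev}_P\big(o, \mathrm{ev}_I(s)\big).$$

For the first case we thus have $\mathrm{ev}_M(o, s) = \max\{c_1 o,\ c_2 \max_{a \in A} s(a)\}$ and for the
second $\mathrm{ev}_M(o, s) = c_1 o + \frac{c_2}{\lvert A\rvert} \sum_{a \in A} s(a)$ as claimed. Given $t_1, t_2 \in M(B, X)$
with $t_i = (b_i, s_i) \in B \times X^A$ we take their unique coupling $t := (b_1, b_2, \langle s_1, s_2\rangle)$
to compute for pseudometrics $d_B$ on $B$ and $d$ on $X$:

$$\begin{aligned}
(d_B, d)^{\downarrow M}(t_1, t_2) &= \widetilde{M}(d_B, d)(t) = \mathrm{ev}_M \circ M(d_B, d)(t) \\
&= \mathrm{ev}_M \circ (d_B \times d^A)(b_1, b_2, \langle s_1, s_2\rangle) \\
&= \mathrm{ev}_M\big(d_B(b_1, b_2),\ \lambda a.\, d(s_1(a), s_2(a))\big).
\end{aligned}$$

Now if we take the two evaluation functions from above, we obtain the Wasserstein
pseudometrics which are given in the example. □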
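The two resulting pseudometrics for the machine bifunctor can be sketched in Python as follows. This is only an illustration under stated assumptions: elements of $M(B, X) = B \times X^A$ are pairs of an output and a dictionary, the alphabet is finite and non-empty, and the function names are hypothetical.

```python
def machine_distance_max(d_B, d, t1, t2, alphabet, c1, c2):
    """First case: max{c1 * d_B(b1, b2), c2 * max_a d(s1(a), s2(a))}."""
    (b1, s1), (b2, s2) = t1, t2
    successor_part = max(d(s1[a], s2[a]) for a in alphabet)
    return max(c1 * d_B(b1, b2), c2 * successor_part)


def machine_distance_avg(d_B, d, t1, t2, alphabet, c1, c2):
    """Second case: c1 * d_B(b1, b2) + c2 * |A|^{-1} * sum_a d(s1(a), s2(a))."""
    (b1, s1), (b2, s2) = t1, t2
    successor_part = sum(d(s1[a], s2[a]) for a in alphabet) / len(alphabet)
    return c1 * d_B(b1, b2) + c2 * successor_part


def euclid(x, y):
    return abs(x - y)


alphabet = ["a", "b"]
t1 = (0.0, {"a": 0.0, "b": 0.5})
t2 = (1.0, {"a": 0.25, "b": 1.0})

# Parameters with c1 + c2 <= 1, as required in the second case.
dist_max = machine_distance_max(euclid, euclid, t1, t2, alphabet, 0.5, 0.5)
dist_avg = machine_distance_avg(euclid, euclid, t1, t2, alphabet, 0.5, 0.5)
```

In both cases the unique coupling $(b_1, b_2, \langle s_1, s_2\rangle)$ is used implicitly, so the functions only combine the output distance with the pointwise successor distances.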

O Example 3.6. Consider the machine endofunctor $M_2 := M(2, \_) = 2 \times \_^A$ with
evaluation function $\mathrm{ev}_{M_2}\colon 2 \times [0,1]^A \to [0,1]$, $(o, s) \mapsto c \cdot \mathrm{ev}_I(s)$ where $c \in (0,1]$ and $\mathrm{ev}_I$ is one
of the evaluation functions for the input functor from Example 3.4. If $d_2$ is the discrete
metric on 2 and $c = c_2$ (where $c_2$ is the parameter for the evaluation function of the
machine bifunctor as in Example 3.5) then the pseudometric obtained via the bifunctor
lifting coincides with the one obtained by endofunctor lifting, i.e. for all pseudometric
spaces $(X, d)$ we have $(d_2, d)^{\downarrow M} = d^{\downarrow M_2}$. Moreover, although couplings for $M_2$ do not
always exist, we have $d^{\downarrow \mathcal{P}_{fin} M_2} = \left(d^{\downarrow M_2}\right)^{\downarrow \mathcal{P}_{fin}}$.

Proof. We first prove that the bifunctor and endofunctor liftings coincide. Given
$t_1, t_2 \in 2 \times X^A$, say $t_i = (o_i, s_i)$, their unique $M$-coupling is $t = (o_1, o_2, \langle s_1, s_2\rangle)$.

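If $o_1 \ne o_2$ no $M_2$-coupling of $t_1, t_2$ exists so we have $d^{\downarrow M_2}(t_1, t_2) = \top$ but also
$d_2(o_1, o_2) = \top$ so $(d_2, d)^{\downarrow M}(t_1, t_2) = \widetilde{M}(d_2, d)(t) \ge c_1 d_2(o_1, o_2) \ge \top$.
If $o_1 = o_2$ the unique $M_2$-coupling of $t_1, t_2$ is $t' = (o_1, \langle s_1, s_2\rangle)$ and $d_2(o_1, o_2) = 0$,
thus

$$\begin{aligned}
(d_2, d)^{\downarrow M}(t_1, t_2) &= \widetilde{M}(d_2, d)(t) = c_2\, \mathrm{ev}_I\big(\lambda a.\, d(s_1(a), s_2(a))\big) \\
&= \mathrm{ev}_{M_2}\big(o_1, \lambda a.\, d(s_1(a), s_2(a))\big) = \mathrm{ev}_{M_2}\big((\mathrm{id}_2 \times d^A)(o_1, \langle s_1, s_2\rangle)\big) \\
&= \widetilde{M_2}d(t') = d^{\downarrow M_2}(t_1, t_2).
\end{aligned}$$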

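For compositionality we adapt the proof of Example 3.3. We know from Lemma P.3.3.2
that

$$d^{\downarrow \mathcal{P}_{fin} M_2} \ge \left(d^{\downarrow M_2}\right)^{\downarrow \mathcal{P}_{fin}} \qquad (6)$$

holds. We now show that we always have equality. Let $(X, d)$ be a pseudometric space
and $T_1, T_2 \in \mathcal{P}_{fin} M_2 X = \mathcal{P}_{fin}(2 \times X^A)$. We distinguish three cases:

Case 1: If $T_1 = T_2 = \emptyset$ we know by reflexivity that both values are 0.

Case 2: If $T_1 = \emptyset \ne T_2$ or $T_1 \ne \emptyset = T_2$ we know from [BBKK14] that $\Gamma_{\mathcal{P}_{fin}}(T_1, T_2) = \emptyset$ and
therefore $\left(d^{\downarrow M_2}\right)^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) = \top$ and thus (6) is an equality.

Case 3: Let $T_1, T_2 \ne \emptyset$. We know from [BBKK14] that we have an optimal coupling
$T^* \in \Gamma_{\mathcal{P}_{fin}}(T_1, T_2)$, say $T^* = \{((o_{j1}, s_{j1}), (o_{j2}, s_{j2})) \in M_2X \times M_2X \mid j \in J\}$ for a suitable
index set $J$. Then using $\pi_i\colon M_2X \times M_2X \to M_2X$ we have $T_i = \mathcal{P}_{fin}\pi_i(T^*) = \pi_i[T^*] =
\{\pi_i((o_{j1}, s_{j1}), (o_{j2}, s_{j2})) \mid j \in J\} = \{(o_{ji}, s_{ji}) \mid j \in J\}$. By optimality:

$$\begin{aligned}
\left(d^{\downarrow M_2}\right)^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) &= \max d^{\downarrow M_2}[T^*] \\
&= \max_{j \in J} d^{\downarrow M_2}\big((o_{j1}, s_{j1}), (o_{j2}, s_{j2})\big). \qquad (7)
\end{aligned}$$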


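Again we will make a case distinction:

► If there is a $j' \in J$ such that $\Gamma_{M_2}\big((o_{j'1}, s_{j'1}), (o_{j'2}, s_{j'2})\big) = \emptyset$ (iff $o_{j'1} \ne o_{j'2}$), we have
$d^{\downarrow M_2}\big((o_{j'1}, s_{j'1}), (o_{j'2}, s_{j'2})\big) = \top$ and using (7) also $\left(d^{\downarrow M_2}\right)^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) = \top$, which
again shows that (6) is an equality.

► Otherwise for every $j \in J$ we can take the unique coupling $(o_{j1}, \langle s_{j1}, s_{j2}\rangle) \in
\Gamma_{M_2}\big((o_{j1}, s_{j1}), (o_{j2}, s_{j2})\big)$, which is necessarily optimal. Continuing (7) we have

$$\begin{aligned}
\left(d^{\downarrow M_2}\right)^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) &= \max_{j \in J} \widetilde{M_2}d\,(o_{j1}, \langle s_{j1}, s_{j2}\rangle) \\
&= \max_{j \in J} \mathrm{ev}_{M_2}\big((\mathrm{id}_2 \times d^A)(o_{j1}, \langle s_{j1}, s_{j2}\rangle)\big) \\
&= \max_{j \in J} \mathrm{ev}_{M_2}\big(o_{j1}, \lambda a.\, d(s_{j1}(a), s_{j2}(a))\big) \\
&= c \cdot \max_{j \in J} \mathrm{ev}_I\big(\lambda a.\, d(s_{j1}(a), s_{j2}(a))\big). \qquad (8)
\end{aligned}$$

Then we define

$$T := \{(o_{j1}, \langle s_{j1}, s_{j2}\rangle) \mid j \in J\} \in \mathcal{P}_{fin} M_2(X \times X) = \mathcal{P}_{fin}\big(2 \times (X \times X)^A\big).$$

We calculate for $\pi_i\colon X \times X \to X$

$$\mathcal{P}_{fin} M_2 \pi_i(T) = (\mathrm{id}_2 \times \pi_i^A)[T] = \{(o_{j1}, s_{ji}) \mid j \in J\} = T_i$$

and thus $T \in \Gamma_{\mathcal{P}_{fin} M_2}(T_1, T_2)$. Moreover we have

$$\begin{aligned}
\widetilde{\mathcal{P}_{fin} M_2}\,d(T) &= \mathrm{ev}_{\mathcal{P}_{fin}} \circ \mathcal{P}_{fin}\mathrm{ev}_{M_2} \circ \mathcal{P}_{fin} M_2 d(T) \\
&= \max\big(\mathcal{P}_{fin}(\mathrm{ev}_{M_2} \circ M_2 d)(T)\big) = \max\big((\mathrm{ev}_{M_2} \circ M_2 d)[T]\big) \\
&= \max\big(\mathrm{ev}_{M_2}\big[(\mathrm{id}_2 \times d^A)[T]\big]\big) \\
&= \max_{j \in J} \mathrm{ev}_{M_2}\big(o_{j1}, \lambda a.\, d(s_{j1}(a), s_{j2}(a))\big) \\
&= c \cdot \max_{j \in J} \mathrm{ev}_I\big(\lambda a.\, d(s_{j1}(a), s_{j2}(a))\big) \qquad (9)
\end{aligned}$$

thus using this, (8) and (6) we conclude that

$$\begin{aligned}
d^{\downarrow \mathcal{P}_{fin} M_2}(T_1, T_2) &\le c \cdot \max_{j \in J} \mathrm{ev}_I\big(\lambda a.\, d(s_{j1}(a), s_{j2}(a))\big) \\
&= \left(d^{\downarrow M_2}\right)^{\downarrow \mathcal{P}_{fin}}(T_1, T_2) \le d^{\downarrow \mathcal{P}_{fin} M_2}(T_1, T_2)
\end{aligned}$$

which proves equality. □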

P.4. Lifting of Natural Transformations and Monads 

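O Proposition 4.1 (Lifting of a Natural Transformation). Let $F$, $G$ be endofunctors on Set
with evaluation functions $\mathrm{ev}_F$, $\mathrm{ev}_G$ and $\lambda\colon F \Rightarrow G$ be a natural transformation. Then the
following holds for all pseudometric spaces $(X, d)$. For the Kantorovich lifting:

1. If $\mathrm{ev}_G \circ \lambda_{[0,\top]} \le \mathrm{ev}_F$ then $d^{\uparrow G} \circ (\lambda_X \times \lambda_X) \le d^{\uparrow F}$, i.e. $\lambda_X$ is nonexpansive.

2. If $\mathrm{ev}_G \circ \lambda_{[0,\top]} = \mathrm{ev}_F$ then $d^{\uparrow G} \circ (\lambda_X \times \lambda_X) = d^{\uparrow F}$, i.e. $\lambda_X$ is an isometry,

while for the Wasserstein lifting

3. If $\mathrm{ev}_G \circ \lambda_{[0,\top]} \le \mathrm{ev}_F$ then $d^{\downarrow G} \circ (\lambda_X \times \lambda_X) \le d^{\downarrow F}$, i.e. $\lambda_X$ is nonexpansive.

4. If $\mathrm{ev}_G \circ \lambda_{[0,\top]} = \mathrm{ev}_F$ and the Kantorovich-Rubinstein duality holds for $F$, i.e. $d^{\uparrow F} = d^{\downarrow F}$,
then $d^{\downarrow G} \circ (\lambda_X \times \lambda_X) = d^{\downarrow F}$, i.e. $\lambda_X$ is an isometry.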

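Proof. Let $t_1, t_2 \in FX$.

1. By naturality of $\lambda$ and $\mathrm{ev}_G \circ \lambda_{[0,\top]} \le \mathrm{ev}_F$ we obtain for every $f\colon X \to [0,\top]$:

$$\widetilde{G}f \circ \lambda_X = \mathrm{ev}_G \circ Gf \circ \lambda_X = \mathrm{ev}_G \circ \lambda_{[0,\top]} \circ Ff \le \mathrm{ev}_F \circ Ff = \widetilde{F}f. \qquad (10)$$

Using this we compute

$$\begin{aligned}
d^{\uparrow G}(\lambda_X(t_1), \lambda_X(t_2)) &= \sup\big\{d_e\big(\widetilde{G}f(\lambda_X(t_1)), \widetilde{G}f(\lambda_X(t_2))\big) \mid f\colon (X, d) \to ([0,\top], d_e)\big\} \\
&\le \sup\big\{d_e\big(\widetilde{F}f(t_1), \widetilde{F}f(t_2)\big) \mid f\colon (X, d) \to ([0,\top], d_e)\big\} = d^{\uparrow F}(t_1, t_2). \qquad (11)
\end{aligned}$$

2. We just have to replace the inequality by equality in (10) and (11).

3. Naturality of $\lambda$ yields the following equations, where $\pi_i\colon X \times X \to X$ are the
projections of the product and $d\colon X \times X \to [0,\top]$ a pseudometric on $X$.

$$\lambda_X \circ F\pi_i = G\pi_i \circ \lambda_{X \times X} \qquad (12)$$

$$\lambda_{[0,\top]} \circ Fd = Gd \circ \lambda_{X \times X} \qquad (13)$$

Using (12), we can see that $\lambda_{X \times X}$ maps every coupling $t$ of $t_1$ and $t_2$ to a coupling
$\lambda_{X \times X}(t)$ of $\lambda_X(t_1)$ and $\lambda_X(t_2)$ because $G\pi_i(\lambda_{X \times X}(t)) = \lambda_X(F\pi_i(t)) = \lambda_X(t_i)$.
Moreover, by our requirement we obtain

$$\widetilde{G}d(\lambda_{X \times X}(t)) = \mathrm{ev}_G \circ Gd \circ \lambda_{X \times X}(t) = \mathrm{ev}_G \circ \lambda_{[0,\top]} \circ Fd(t) \le \mathrm{ev}_F \circ Fd(t) = \widetilde{F}d(t).$$

With these preparations at hand we can finally see that

$$\begin{aligned}
d^{\downarrow G}(\lambda_X(t_1), \lambda_X(t_2)) &= \inf\big\{\widetilde{G}d(t') \mid t' \in \Gamma_G(\lambda_X(t_1), \lambda_X(t_2))\big\} \\
&\le \inf\big\{\widetilde{G}d(\lambda_{X \times X}(t)) \mid t \in \Gamma_F(t_1, t_2)\big\} \\
&\le \inf\big\{\widetilde{F}d(t) \mid t \in \Gamma_F(t_1, t_2)\big\} = d^{\downarrow F}(t_1, t_2).
\end{aligned}$$

4. Using the previous two results and the fact that Wasserstein is an upper bound
yields:

$$d^{\uparrow F} = d^{\uparrow G} \circ (\lambda_X \times \lambda_X) \le d^{\downarrow G} \circ (\lambda_X \times \lambda_X) \le d^{\downarrow F}$$

and since $d^{\uparrow F} = d^{\downarrow F}$ all these inequalities are equalities. □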


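O Corollary 4.2 (Lifting of a Monad). Let $(T, \eta, \mu)$ be a Set-monad and $\mathrm{ev}_T$ an evaluation
function for $T$. Then the following holds.

1. If $\mathrm{ev}_T \circ \eta_{[0,\top]} \le \mathrm{id}_{[0,\top]}$ then $\eta$ is nonexpansive for both liftings. Hence we obtain the unit
$\overline{\eta}\colon \overline{\mathrm{Id}} \Rightarrow \overline{T}$ in PMet.

2. If $\mathrm{ev}_T \circ \eta_{[0,\top]} = \mathrm{id}_{[0,\top]}$ then $\eta$ is an isometry for both liftings.

3. Let $d^T \in \{d^{\uparrow T}, d^{\downarrow T}\}$. If $\mathrm{ev}_T \circ \mu_{[0,\top]} \le \mathrm{ev}_T \circ T\mathrm{ev}_T$ and compositionality holds for $TT$,
i.e. $(d^T)^T = d^{TT}$, then $\mu$ is nonexpansive, i.e. $d^T \circ (\mu_X \times \mu_X) \le (d^T)^T$. This yields the
multiplication $\overline{\mu}\colon \overline{T}\,\overline{T} \Rightarrow \overline{T}$ in PMet.


Proof. This is an immediate consequence of Proposition 4.1. For the unit take $F = \mathrm{Id}$
with evaluation function $\mathrm{ev}_F = \mathrm{id}_{[0,\top]}$, hence $d^{\uparrow F} = d^{\downarrow F} = d$, and $G = T$, $\mathrm{ev}_G = \mathrm{ev}_T$,
$\lambda = \eta\colon \mathrm{Id} \Rightarrow T$. For the multiplication take $F = TT$, $G = T$, $\mathrm{ev}_F = \mathrm{ev}_{TT} = \mathrm{ev}_T \circ T\mathrm{ev}_T$,
$\mathrm{ev}_G = \mathrm{ev}_T$ and $\lambda = \mu$. □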
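The two conditions on the evaluation function can be checked on small instances. The following Python sketch does this for the finite powerset monad with the maximum as evaluation function and for the finitely supported distribution monad with the expected value as evaluation function. The encodings (frozensets, dictionaries, lists of weighted distributions), the function names, and the convention that the maximum of the empty set is 0 are assumptions made for this illustration; the code only tests sample inputs and does not replace the proof.

```python
from fractions import Fraction


def ev_powerset(values):
    """Evaluation function for the powerset functor: the maximum (0 if empty)."""
    return max(values, default=0)


def mu_powerset(sets):
    """Multiplication of the powerset monad: union of a collection of sets."""
    return frozenset().union(*sets)


def ev_dist(dist):
    """Evaluation function for the distribution functor: the expected value."""
    return sum(r * p for r, p in dist.items())


def mu_dist(weighted):
    """Multiplication of the distribution monad on a list of (weight, dist) pairs."""
    flat = {}
    for q, dist in weighted:
        for r, p in dist.items():
            flat[r] = flat.get(r, 0) + q * p
    return flat


half = Fraction(1, 2)

# Unit condition ev_T(eta(r)) = r, with eta(r) = {r} resp. the Dirac distribution.
unit_ok_powerset = ev_powerset(frozenset([half])) == half
unit_ok_dist = ev_dist({half: Fraction(1)}) == half

# Multiplication condition ev_T(mu(t)) <= ev_T(T ev_T (t)) on sample inputs.
sets = [frozenset([Fraction(1, 4), half]), frozenset([Fraction(3, 4)]), frozenset()]
lhs_powerset = ev_powerset(mu_powerset(sets))
rhs_powerset = ev_powerset([ev_powerset(s) for s in sets])

weighted = [
    (Fraction(1, 3), {Fraction(0): half, Fraction(1): half}),
    (Fraction(2, 3), {half: Fraction(1)}),
]
lhs_dist = ev_dist(mu_dist(weighted))
rhs_dist = sum(q * ev_dist(dist) for q, dist in weighted)
```

On these inputs both sides of the multiplication condition coincide, which matches the expectation that the maximum of a union is the maximum of the maxima and that the expected value of a flattened distribution is the weighted sum of the inner expected values.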




P.5. Trace Metrics in Eilenberg-Moore 

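O Corollary 5.2 (Lifting of an EM-law). Let $F$, $G$ be weak pullback preserving endofunctors
on Set with well-behaved evaluation functions $\mathrm{ev}_F$, $\mathrm{ev}_G$ and $\lambda\colon FG \Rightarrow GF$ be an EM-law. If
the evaluation functions satisfy $\mathrm{ev}_G \circ G\mathrm{ev}_F \circ \lambda_{[0,\top]} \le \mathrm{ev}_F \circ F\mathrm{ev}_G$ and compositionality holds
for $FG$, then $\lambda$ is nonexpansive and hence $\overline{\lambda}\colon \overline{F}\,\overline{G} \Rightarrow \overline{G}\,\overline{F}$ is also an EM-law.

Proof. For $FG$ we take the evaluation function $\mathrm{ev}_{FG} = \mathrm{ev}_F \circ F\mathrm{ev}_G$ and for $GF$ the
evaluation function $\mathrm{ev}_{GF} = \mathrm{ev}_G \circ G\mathrm{ev}_F$. We have

$$\mathrm{ev}_{GF} \circ \lambda_{[0,\top]} = \mathrm{ev}_G \circ G\mathrm{ev}_F \circ \lambda_{[0,\top]} \le \mathrm{ev}_F \circ F\mathrm{ev}_G = \mathrm{ev}_{FG}.$$

By Proposition 4.1 we know that $d^{\downarrow GF} \circ (\lambda_X \times \lambda_X) \le d^{\downarrow FG}$ and by Lemma P.3.3.2 we
have $\left(d^{\downarrow F}\right)^{\downarrow G} \le d^{\downarrow GF}$. Plugging everything together we see that

$$\left(d^{\downarrow F}\right)^{\downarrow G} \circ (\lambda_X \times \lambda_X) \le d^{\downarrow GF} \circ (\lambda_X \times \lambda_X) \le d^{\downarrow FG} = \left(d^{\downarrow G}\right)^{\downarrow F}$$

which is the desired nonexpansiveness. □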


O Example 5.3 (EM-law for Nondeterministic Automata). Let $(\mathcal{P}_{fin}, \eta, \mu)$ be the finite
powerset monad from Example 4.3. The EM-law $\lambda\colon \mathcal{P}_{fin}(2 \times \_^A) \Rightarrow 2 \times \mathcal{P}_{fin}(\_)^A$ is
defined, for any set $X$, as

$$\lambda_X(S) = \big(o,\ \lambda a \in A.\, \{s'(a) \mid (o', s') \in S\}\big), \quad \text{where } o = \begin{cases} 1 & \exists s' \in X^A.\, (1, s') \in S \\ 0 & \text{else.} \end{cases}$$

This is exactly the one exploited for the standard powerset construction from automata
theory [SBBR13]. Indeed, for a nondeterministic automaton $c\colon X \to 2 \times \mathcal{P}_{fin}(X)^A$, the
map $[\![-]\!] \circ \eta_X$ assigns to each state its accepted language. Corollary 5.2 ensures that it
is nonexpansive (see Appendix P for a detailed proof).
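As a minimal illustration of this law, the following Python sketch computes $\lambda_X(S)$ for a finite collection $S$ of pairs of an output bit and a successor function. The encoding (a list of pairs whose second component is a dictionary from letters to states) and the name `em_law_nfa` are assumptions made for this example.

```python
def em_law_nfa(S, alphabet):
    """Map a finite set S of pairs (o', s') to (o, a -> {s'(a) | (o', s') in S}).

    S        -- iterable of pairs (output bit, dict from letters to states)
    alphabet -- finite list of input letters
    """
    S = list(S)
    # The combined state is accepting iff some member of S is accepting.
    o = 1 if any(out == 1 for out, _ in S) else 0
    # For each letter collect the successors of all members of S.
    successors = {a: frozenset(s[a] for _, s in S) for a in alphabet}
    return o, successors


S = [
    (0, {"a": "x", "b": "y"}),
    (1, {"a": "z", "b": "y"}),
]
o, successors = em_law_nfa(S, ["a", "b"])
```

This is the familiar step of the powerset construction: a set of states accepts if one of its members does, and its successor under a letter is the union of the members' successors.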

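Proof. The functors are a composition of known endofunctors. We have $F = \mathcal{P}_{fin}(2 \times \_^A) = \mathcal{P}_{fin} M_2$
and $G = 2 \times \mathcal{P}_{fin}(\_)^A = M_2 \mathcal{P}_{fin}$ where $M_2 := M(2, \_)$ is the endofunctor
obtained from the machine bifunctor $M$ by fixing its first component to 2. The evaluation
functions are $\mathrm{ev}_F\colon \mathcal{P}_{fin}(2 \times [0,1]^A) \to [0,1]$ where for $S \in \mathcal{P}_{fin}(2 \times [0,1]^A)$

$$\begin{aligned}
\mathrm{ev}_F(S) &= \mathrm{ev}_{\mathcal{P}_{fin}} \circ \mathcal{P}_{fin}\mathrm{ev}_{M_2}(S) = \max\{\mathrm{ev}_{M_2}(o, s) \mid (o, s) \in S\} \\
&= \max\Big\{c \cdot \max_{a \in A} s(a) \;\Big|\; (o, s) \in S\Big\} = c \cdot \max_{(o,s) \in S}\, \max_{a \in A} s(a)
\end{aligned}$$

and $\mathrm{ev}_G\colon 2 \times (\mathcal{P}_{fin}[0,\top])^A \to [0,\top]$ where for $(o, s) \in 2 \times (\mathcal{P}_{fin}[0,\top])^A$

$$\mathrm{ev}_G(o, s) = \mathrm{ev}_{M_2} \circ M_2(\mathrm{ev}_{\mathcal{P}_{fin}})(o, s) = \mathrm{ev}_{M_2}\big(o, \lambda a.\, \max s(a)\big) = c \cdot \max_{a \in A} \max s(a).$$

As we have seen in Example 3.6 we have compositionality for $F$. We want to apply
Proposition 4.1.3 to show nonexpansiveness. For this we have to check that the
inequality $\mathrm{ev}_G \circ \lambda_{[0,1]} \le \mathrm{ev}_F$ holds. Indeed we have:

$$\mathrm{ev}_G \circ \lambda_{[0,1]}(S) = c \cdot \max_{a \in A} \max\{s(a) \mid (o, s) \in S\} = c \cdot \max_{a \in A}\, \max_{(o,s) \in S} s(a) = \mathrm{ev}_F(S)$$

which concludes the proof. □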
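The equality of the two evaluation functions along the law can be tested on a concrete input. The Python sketch below is a hedged illustration that uses the maximum-based evaluation function for the input functor and assumes a non-empty finite alphabet; the names `ev_F`, `ev_G` and `law` are hypothetical stand-ins for $\mathrm{ev}_F$, $\mathrm{ev}_G$ and $\lambda_{[0,1]}$.

```python
from fractions import Fraction


def law(S, alphabet):
    """The law applied to a finite set S of pairs (bit, dict letter -> value)."""
    o = 1 if any(out == 1 for out, _ in S) else 0
    return o, {a: frozenset(s[a] for _, s in S) for a in alphabet}


def ev_F(S, alphabet, c):
    """c * max over (o, s) in S of max over letters of s(a); 0 for empty S."""
    return c * max((max(s[a] for a in alphabet) for _, s in S), default=0)


def ev_G(pair, alphabet, c):
    """c * max over letters of the maximum of the set s(a); empty sets count as 0."""
    _, s = pair
    return c * max(max(s[a], default=0) for a in alphabet)


alphabet = ["a", "b"]
c = Fraction(1, 2)
S = [
    (0, {"a": Fraction(1, 4), "b": Fraction(1, 2)}),
    (1, {"a": Fraction(3, 4), "b": Fraction(0)}),
]

lhs = ev_G(law(S, alphabet), alphabet, c)
rhs = ev_F(S, alphabet, c)
```

Both values agree on this input, reflecting that the two nested maxima can be exchanged.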


O Example 5.4 (EM-law for Probabilistic Automata). Let $(\mathcal{D}, \eta, \mu)$ be the distribution
monad from Example 4.4 and $M$ be the machine bifunctor from Example 3.5. There is
a known [SBBR13] EM-law $\lambda\colon \mathcal{D}([0,1] \times \_^A) \Rightarrow [0,1] \times \mathcal{D}(\_)^A$ given by the assignment

$$\lambda_X(P) = \Big(\sum_{r \in [0,1]} r \cdot P(r, X^A),\ \ \lambda a \in A.\, \lambda x \in X.\, \sum_{s \in X^A,\, s(a) = x} P([0,1], s)\Big).$$

Also this EM-law is nonexpansive, as shown in Appendix P.
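For finitely supported distributions this assignment can be sketched in Python. The encoding is an assumption made for the example: a distribution on $[0,1] \times X^A$ is a dictionary whose keys are pairs of an output value and a successor function given as a tuple of (letter, state) pairs, and `em_law_prob` is a hypothetical name.

```python
from collections import defaultdict
from fractions import Fraction


def em_law_prob(P, alphabet):
    """Return (expected output, letter -> distribution on successor states).

    P        -- dict mapping (r, s) to a probability, where r is an output in
                [0, 1] and s is a tuple of (letter, state) pairs
    alphabet -- finite list of input letters
    """
    # First component: the expected output value.
    expected_output = sum(r * p for (r, _), p in P.items())
    # Second component: for each letter, the image distribution of s -> s(a).
    successors = {a: defaultdict(Fraction) for a in alphabet}
    for (_, s), p in P.items():
        s_map = dict(s)
        for a in alphabet:
            successors[a][s_map[a]] += p
    return expected_output, {a: dict(dist) for a, dist in successors.items()}


P = {
    (Fraction(1, 2), (("a", "x"), ("b", "y"))): Fraction(1, 4),
    (Fraction(1), (("a", "y"), ("b", "y"))): Fraction(3, 4),
}
output, successors = em_law_prob(P, ["a", "b"])
```

Each successor distribution sums to 1 because every successor function contributes its full probability to exactly one state per letter.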

Proof. We first quickly check that the definition is sound, i.e. that we get a probability
distribution for each $a \in A$:

$$\sum_{x \in X} \Big(\sum_{s \in X^A,\, s(a) = x} P([0,1], s)\Big) = \sum_{s \in X^A} P([0,1], s) = 1.$$

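Having verified this, we now want to show nonexpansiveness. The involved bifunctors
$F$, $G$ are given by the assignments $F(B, X) = \mathcal{D}(B \times X^A)$ and $G(B, X) = B \times (\mathcal{D}X)^A$ and
arise from composition of the distribution functor, the identity functor and the machine
bifunctor: We have $F = \mathcal{D} \circ M$ and $G = M \circ (\mathrm{Id} \times \mathcal{D})$.

Since all of these functors have optimal couplings, we have compositionality and the
canonical evaluation functions for the composed functors are

$$\mathrm{ev}_F := \mathrm{ev}_{\mathcal{D}} \circ \mathcal{D}\mathrm{ev}_M\colon \mathcal{D}([0,1] \times [0,1]^A) \to [0,1] \quad \text{and}$$

$$\mathrm{ev}_G := \mathrm{ev}_M \circ M(\mathrm{id}_{[0,1]}, \mathrm{ev}_{\mathcal{D}})\colon [0,1] \times (\mathcal{D}[0,1])^A \to [0,1].$$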

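We will now define a function $\Lambda_X\colon \mathcal{D}([0,1]^2 \times (X \times X)^A) \to [0,1]^2 \times (\mathcal{D}(X \times X))^A$ which
transfers $F$-couplings to suitable $G$-couplings in the following sense. For any $P_1, P_2 \in
\mathcal{D}([0,1] \times X^A)$ and any $P \in \Gamma_F(P_1, P_2) \subseteq \mathcal{D}([0,1] \times [0,1] \times (X \times X)^A)$ the function $\Lambda_X$
has to satisfy the following two requirements

$$\Lambda_X(P) \in \Gamma_G(\lambda_X(P_1), \lambda_X(P_2)) \qquad (14)$$

$$\widetilde{G}(d_B, d)(\Lambda_X(P)) \le \widetilde{F}(d_B, d)(P) \qquad (15)$$

because then we have

$$\begin{aligned}
(d_B, d)^{\downarrow G}(\lambda_X(P_1), \lambda_X(P_2)) &= \inf\{\widetilde{G}(d_B, d)(P') \mid P' \in \Gamma_G(\lambda_X(P_1), \lambda_X(P_2))\} \\
&\le \inf\{\widetilde{G}(d_B, d)(\Lambda_X(P)) \mid P \in \Gamma_F(P_1, P_2)\} \\
&\le \inf\{\widetilde{F}(d_B, d)(P) \mid P \in \Gamma_F(P_1, P_2)\} = (d_B, d)^{\downarrow F}(P_1, P_2)
\end{aligned}$$

which, due to compositionality, proves the desired nonexpansiveness of $\lambda_X$. So let us
now define $\Lambda_X$ and prove that it satisfies the above requirements: For any set $X$ and
any $P \in \mathcal{D}([0,1] \times [0,1] \times (X \times X)^A)$ we define $\Lambda_X(P) = (o_1(P), o_2(P), s(P))$ where

$$o_1(P) = \sum_{r \in [0,1]} r \cdot P(r, [0,1], (X \times X)^A), \qquad o_2(P) = \sum_{r \in [0,1]} r \cdot P([0,1], r, (X \times X)^A) \quad \text{and}$$

$$s(P)\colon A \to \mathcal{D}(X \times X), \qquad s(P)(a)(x, y) = \sum_{s \in (X \times X)^A,\, s(a) = (x, y)} P([0,1]^2, s).$$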

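Observe that this is completely analogous to the definition of the components $\lambda_X$ of our
distributive law where for any $Q \in \mathcal{D}([0,1] \times X^A)$ we have $\lambda_X(Q) = (o'(Q), s'(Q))$ with

$$o'(Q) = \sum_{r \in [0,1]} r \cdot Q(r, X^A), \qquad s'(Q)\colon A \to \mathcal{D}X, \qquad s'(Q)(a)(x) = \sum_{s \in X^A,\, s(a) = x} Q([0,1], s).$$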

Let us now show that the above definition of $\Lambda_X$ satisfies our requirements. We thus
assume from here on that $P \in \Gamma_F(P_1, P_2)$ for some arbitrary $P_1, P_2 \in \mathcal{D}([0,1] \times X^A)$, i.e.
we know $F(\pi_i, \pi_i)(P) = P_i$. In order to show (14), we have to prove that the equation

$$G(\pi_i, \pi_i)(\Lambda_X(P)) = \lambda_X(P_i) \qquad (16)$$

holds. The left hand side of this equation evaluates to

$$G(\pi_i, \pi_i)(\Lambda_X(P)) = \big(\pi_i \times (\mathcal{D}\pi_i)^A\big)(o_1(P), o_2(P), s(P)) = (o_i(P), \mathcal{D}\pi_i \circ s(P)) \qquad (17)$$

and since $F(\pi_i, \pi_i)(P) = P_i$ the right hand side of (16) evaluates to

$$\lambda_X(P_i) = \lambda_X(F(\pi_i, \pi_i)(P)) = \Big(o'\big(\mathcal{D}(\pi_i \times \pi_i^A)(P)\big),\ s'\big(\mathcal{D}(\pi_i \times \pi_i^A)(P)\big)\Big). \qquad (18)$$

In order to prove (16) we will thus have to show that $o'\big(\mathcal{D}(\pi_i \times \pi_i^A)(P)\big) = o_i(P)$ and
also $s'\big(\mathcal{D}(\pi_i \times \pi_i^A)(P)\big) = \mathcal{D}\pi_i \circ s(P)$ holds. We first compute

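$$\begin{aligned}
o'\big(\mathcal{D}(\pi_i \times \pi_i^A)(P)\big) &= \sum_{r \in [0,1]} r \cdot \mathcal{D}(\pi_i \times \pi_i^A)(P)(r, X^A) \\
&= \sum_{r \in [0,1]} r \cdot \Big(P \circ (\pi_i \times \pi_i^A)^{-1}\big[\{r\} \times X^A\big]\Big) \\
&= \sum_{r \in [0,1]} r \cdot P\big(\{(o_1, o_2, s) \in [0,1]^2 \times (X \times X)^A \mid (\pi_i \times \pi_i^A)(o_1, o_2, s) \in \{r\} \times X^A\}\big) \\
&= \sum_{r \in [0,1]} r \cdot P\big(\{(o_1, o_2, s) \in [0,1]^2 \times (X \times X)^A \mid o_i = r\}\big) = o_i(P)
\end{aligned}$$

showing that indeed the first components of the tuples in (17) and (18) are the same.
For the second components we have, for every $a \in A$ and $x \in X$,

$$\begin{aligned}
s'\big(\mathcal{D}(\pi_i \times \pi_i^A)(P)\big)(a)(x) &= \sum_{s \in X^A,\, s(a) = x} \mathcal{D}(\pi_i \times \pi_i^A)(P)([0,1], s) \\
&= \sum_{s \in X^A,\, s(a) = x} P\big(\{(o_1, o_2, s') \in [0,1]^2 \times (X \times X)^A \mid (\pi_i \times \pi_i^A)(o_1, o_2, s') \in [0,1] \times \{s\}\}\big) \\
&= \sum_{s \in X^A,\, s(a) = x} P\big(\{(o_1, o_2, s') \in [0,1]^2 \times (X \times X)^A \mid \pi_i \circ s' = s\}\big) \\
&= \sum_{s' \in (X \times X)^A,\, \pi_i \circ s'(a) = x} P([0,1]^2, s')
\end{aligned}$$

and

$$\begin{aligned}
\big(\mathcal{D}\pi_i \circ s(P)\big)(a)(x) &= s(P)(a) \circ \pi_i^{-1}[\{x\}] = s(P)(a)\big(\{y \in X \times X \mid \pi_i(y) = x\}\big) \\
&= \sum_{s \in (X \times X)^A,\, \pi_i \circ s(a) = x} P([0,1]^2, s) \qquad (19)
\end{aligned}$$

which shows that also the second components of (17) and (18) coincide. Therefore (16)
holds, i.e. we have proved $\Lambda_X(P) \in \Gamma_G(\lambda_X(P_1), \lambda_X(P_2))$ as claimed in (14). We now
show (15). For the left hand side of that inequality we compute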

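$$\begin{aligned}
\widetilde{G}(d_B, d)(\Lambda_X(P)) &= \big(\mathrm{ev}_G \circ G(d_B, d)\big)(\Lambda_X(P)) = \mathrm{ev}_G\big(G(d_B, d)(\Lambda_X(P))\big) \\
&= \mathrm{ev}_G\big((d_B \times (\mathcal{D}d)^A)(o_1(P), o_2(P), s(P))\big) \\
&= \mathrm{ev}_G\big(d_B(o_1(P), o_2(P)),\ \lambda a.\, \mathcal{D}d(s(P)(a))\big) \\
&= \big(\mathrm{ev}_M \circ M(\mathrm{id}_{[0,1]}, \mathrm{ev}_{\mathcal{D}})\big)\big(d_B(o_1(P), o_2(P)),\ \lambda a.\, \mathcal{D}d(s(P)(a))\big) \\
&= \mathrm{ev}_M\Big(M(\mathrm{id}_{[0,1]}, \mathrm{ev}_{\mathcal{D}})\big(d_B(o_1(P), o_2(P)),\ \lambda a.\, \mathcal{D}d(s(P)(a))\big)\Big) \\
&= \mathrm{ev}_M\Big(d_B(o_1(P), o_2(P)),\ \mathrm{ev}_{\mathcal{D}}^A\big(\lambda a.\, \mathcal{D}d(s(P)(a))\big)\Big) \\
&= \mathrm{ev}_M\Big(d_B(o_1(P), o_2(P)),\ \lambda a.\, \mathrm{ev}_{\mathcal{D}}\big(\mathcal{D}d(s(P)(a))\big)\Big) \\
&= c_1\, d_B(o_1(P), o_2(P)) + \frac{c_2}{\lvert A\rvert} \sum_{a \in A} \mathrm{ev}_{\mathcal{D}}\big(\mathcal{D}d(s(P)(a))\big) \qquad (20)
\end{aligned}$$

and since for each $a \in A$ we have

$$\begin{aligned}
\mathrm{ev}_{\mathcal{D}}\big(\mathcal{D}d(s(P)(a))\big) &= \sum_{r \in [0,1]} r \cdot s(P)(a)\big(d^{-1}[\{r\}]\big) = \sum_{(x,y) \in X^2} d(x, y) \cdot s(P)(a)(x, y) \\
&= \sum_{(x,y) \in X^2} d(x, y) \cdot \Big(\sum_{s \in (X \times X)^A,\, s(a) = (x,y)} P([0,1]^2, s)\Big) \\
&= \sum_{s \in (X \times X)^A} d(s(a)) \cdot P([0,1]^2, s)
\end{aligned}$$

we may continue (20) as follows:

$$\widetilde{G}(d_B, d)(\Lambda_X(P)) = c_1\, d_B(o_1(P), o_2(P)) + \frac{c_2}{\lvert A\rvert} \sum_{a \in A}\ \sum_{s \in (X \times X)^A} d(s(a)) \cdot P([0,1]^2, s). \qquad (21)$$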

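For the right hand side of (15) we have

$$\begin{aligned}
\widetilde{F}(d_B, d)(P) &= \big(\mathrm{ev}_F \circ F(d_B, d)\big)(P) = \big((\mathrm{ev}_{\mathcal{D}} \circ \mathcal{D}\mathrm{ev}_M) \circ \mathcal{D}(M(d_B, d))\big)(P) \\
&= \mathrm{ev}_{\mathcal{D}}\Big(\mathcal{D}\mathrm{ev}_M\big(\mathcal{D}(M(d_B, d))(P)\big)\Big) = \mathrm{ev}_{\mathcal{D}}\Big(\mathcal{D}\mathrm{ev}_M\big(\mathcal{D}(d_B \times d^A)(P)\big)\Big) \\
&= \mathrm{ev}_{\mathcal{D}}\Big(\big(\mathcal{D}(d_B \times d^A)(P)\big) \circ \mathrm{ev}_M^{-1}\Big) = \sum_{r \in [0,1]} r \cdot \big(\mathcal{D}(d_B \times d^A)(P)\big)\big(\mathrm{ev}_M^{-1}[\{r\}]\big) \\
&= \sum_{(o, s') \in [0,1] \times [0,1]^A} \mathrm{ev}_M(o, s') \cdot \big(\mathcal{D}(d_B \times d^A)(P)\big)(o, s') \\
&= \sum_{(o, s') \in [0,1] \times [0,1]^A} \mathrm{ev}_M(o, s') \cdot P\big((d_B \times d^A)^{-1}[\{(o, s')\}]\big) \\
&= \sum_{(o_1, o_2, s) \in [0,1]^2 \times (X \times X)^A} \mathrm{ev}_M\big((d_B \times d^A)(o_1, o_2, s)\big) \cdot P(o_1, o_2, s) \\
&= \sum_{(o_1, o_2, s) \in [0,1]^2 \times (X \times X)^A} \mathrm{ev}_M\big(d_B(o_1, o_2),\ \lambda a.\, d(s(a))\big) \cdot P(o_1, o_2, s) \\
&= \sum_{(o_1, o_2, s) \in [0,1]^2 \times (X \times X)^A} \Big(c_1\, d_B(o_1, o_2) + \frac{c_2}{\lvert A\rvert} \sum_{a \in A} d(s(a))\Big) \cdot P(o_1, o_2, s) \\
&= c_1\, o(P) + \frac{c_2}{\lvert A\rvert} \sum_{s \in (X \times X)^A}\ \sum_{a \in A} d(s(a)) \cdot P([0,1]^2, s) \qquad (22)
\end{aligned}$$

with

$$o(P) = \sum_{(o_1, o_2) \in [0,1]^2} d_B(o_1, o_2) \cdot P(o_1, o_2, (X \times X)^A).$$

Comparing (21) and (22) we see that in order to obtain inequality (15) we just have to
show

$$d_B(o_1(P), o_2(P)) \le o(P).$$

This is easily done using the fact that $d_B = d_e$ is the Euclidean metric and the triangle
inequality:

$$\begin{aligned}
d_B(o_1(P), o_2(P)) &= \lvert o_1(P) - o_2(P) \rvert \\
&= \Big\lvert \sum_{r_1 \in [0,1]} r_1 \cdot P(r_1, [0,1], (X \times X)^A) - \sum_{r_2 \in [0,1]} r_2 \cdot P([0,1], r_2, (X \times X)^A) \Big\rvert \\
&= \Big\lvert \sum_{r_1, r_2 \in [0,1]} r_1 \cdot P(r_1, r_2, (X \times X)^A) - \sum_{r_1, r_2 \in [0,1]} r_2 \cdot P(r_1, r_2, (X \times X)^A) \Big\rvert \\
&= \Big\lvert \sum_{r_1, r_2 \in [0,1]} (r_1 - r_2) \cdot P(r_1, r_2, (X \times X)^A) \Big\rvert \\
&\le \sum_{r_1, r_2 \in [0,1]} \big\lvert (r_1 - r_2) \cdot P(r_1, r_2, (X \times X)^A) \big\rvert \\
&= \sum_{r_1, r_2 \in [0,1]} \lvert r_1 - r_2 \rvert \cdot P(r_1, r_2, (X \times X)^A) = o(P).
\end{aligned}$$

We have thus also completed the proof of our second claim: the inequality (15). □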
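The final estimate says that the distance between the two expected outputs is bounded by the expected distance of the outputs under the coupling. A small Python sketch can confirm this on a sample joint distribution. The encoding of the marginal of $P$ on $[0,1]^2$ as a dictionary and the function names are assumptions made for this illustration.

```python
from fractions import Fraction


def expected_outputs(joint):
    """Return (o_1, o_2), the expected values of the two output components."""
    o1 = sum(r1 * p for (r1, _), p in joint.items())
    o2 = sum(r2 * p for (_, r2), p in joint.items())
    return o1, o2


def expected_gap(joint):
    """Return the expected Euclidean distance between the two outputs."""
    return sum(abs(r1 - r2) * p for (r1, r2), p in joint.items())


# Hypothetical joint distribution over pairs of outputs (r1, r2).
joint = {
    (Fraction(0), Fraction(1)): Fraction(1, 2),
    (Fraction(1), Fraction(0)): Fraction(1, 4),
    (Fraction(1, 2), Fraction(1, 2)): Fraction(1, 4),
}

o1, o2 = expected_outputs(joint)
gap = expected_gap(joint)
```

Here the left hand side is $\lvert 3/8 - 5/8 \rvert = 1/4$ while the right hand side is $3/4$, so the inequality holds strictly for this coupling.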
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